But Y would I want to do a thing like this?

(This is a snapshot of my old weblog. New posts and selected republished essays can be found at raganwald.com.)

Tuesday, February 06, 2007

Choose life. Choose a job. Choose a starter home. Choose dental insurance, leisure wear and matching luggage. Choose your future. But Y would I want to do a thing like that?

Writing about first-class functions and their compatibility with object-oriented programming naturally leads to the Y combinator. And that is the point where eyes glaze over and soft, snoring sounds rise from RSS readers everywhere.

But please bear with me, this essay is not really about the Y combinator, it’s about learning new things and expanding our capacity to think.

Sharpening the saw

Years ago I picked up Steven Covey’s book The 7 Habits of Highly Effective People. If the book is a test out of seven, I really wasn’t doing very well.

If you’ve read the book, you probably remember that he talked about “Sharpening the Saw,” investing in your own abilities. That’s incredibly important, but I don’t need to tell you that. If you exercise with programming katas, or learn a new programming language once a year, or pick up a book like The Reasoned Schemer and actually go through the exercises, then you are already in the top 1% of software developers for personal skills improvement. (Sorry, certifications don’t count. They are the classic case of doing the wrong thing for the wrong reason!)

New ideas—by which I mean, new to you—are an important way to sharpen your saw. If for no other reason than this: the brain needs regular exercise to perform at or near its potential. Learning new things keeps you sharp, even if you don’t directly use the things you learned.

Others have suggested that learning Lisp is beneficial to your programming skills in its own right. That’s one good way to sharpen your saw. But I add to that an important caveat: to obtain the deepest benefit from learning a new language, you must learn to think in the new language, not just learn to translate your favourite programming language syntax and idioms into it.

Think different

The interesting thing about that is that almost by definition, if you see something in, say, Lisp that solves a problem you already have, you won’t learn much from the Lisp code. It is tempting to think that Lisp (or any other language) will somehow do what you’re already doing in some wonderfully magic way that is obviously better. But no, that isn’t how it really works.

For your problems are tuned to your existing tools. You simply can’t imagine problems that your tools can’t solve well, much less can’t solve at all. That’s why there are so few continuation-based web servers. Who’s going to invent one unless they have a programming paradigm with continuations?

To Mock a Mockingbird: The most enjoyable text on the subject of combinatory logic ever written. What other textbook features starlings, kestrels, and other songbirds?

And worse, when a new tool is applied to a problem you think you know well, you will probably dismiss the things the new tool does well. Look at how many people dismiss brevity of code. Note that all of the people ignore the statistics about the constant ratio between bugs and lines of code use verbose languages. Look at how many people dismiss continuation-based servers as a design approach. Note that all of them use programming languages bereft of control flow abstractions.

Thus, to truly learn a new tool, you must not just learn the new tool, you must apply it to new kinds of problems. It’s no good learning to replace simple iteration with first-class functions: all you’ve learned is syntax. To really learn first-class functions, you must seek out problems that aren’t easily solved with iteration.

The Why of Y

Which leads me back to fixed point combinators. They appear to have no practical (as in making money) use. And that’s why I’m suggesting to you that you figure out how to make one in your language of choice. The very fact that the problem is far outside of your realm of “practicality” guarantees that you will learn something. You won’t be simply applying your same-old, same-old techniques and patterns to a slightly new problem.

Start your research with Richard P. Gabriel’s The Why of Y. Try porting his examples directly to your favourite programming language. If what you want to use is too brain-damaged to support closures, you may need to do a little greenspunning and build a little functor infrastructure.

Don’t be dissuaded if you have to follow the functor route: you are learning far more about your language and about programming in general than the shmoes that settle for learning five new buzzwords related to the latest WS-* interoperability with XPath 3.x.

If you prefer a fun approach to learning, you can do not better than Raymond Smullyan’s To Mock a Mockingbird: an enjoyable romp through the world of combinatory logic. After reading this book, you will have mastered the S, K, I, Y, and other combinators. Added bonuses include a safe that can only be opened by applying Gödel’s Incompleteness Theorem to its combination. How can you read this book and not learn?

Eating my own dog food

I thought of a few things to say along these lines last week and then I abruptly realizing I was asking you to “Do as I say, not as I do.” What good is recycling problems I first encountered in University textbooks two decades ago? I put this post aside and set to work on a problem of my own.

I set out to write a function for making recursive functions—a function extensionally equal to the Y combinator—in Ruby. The ultimate goal is to take something like:

lambda { |f| lambda { |n| n.zero? && 1 or f.call(n-1) } }

And be able to have it call itself recursively. In this case, to compute the factorial function.

This is trivial, given that Ruby supports named recursion, but if you want to write a fixed-point combinator you want to write a function that makes recursive functions without using the host language’s support for named recursive calls. In other words, you are bootstrapping named recursion out of anonymous first-class functions.¹

There are important theoretical implications of being able to do this, but the killer reason to try it is to learn.

I started my quest for a function for making recursive functions with a rather trivial observation based on OO programming and the Curry function:

require 'test/unit'

class ExempliGratia < Test::Unit::TestCase

  CURRY = lambda { |f, a| lambda { |*b| f.call(a, *b) } }

  def test_recursive_curry
    maker = lambda { |func_with_me|
      CURRY.call(func_with_me, func_with_me)
    }
    assert_equal(120, maker.call(lambda { |me, n| n.zero? && 1 or n * me.call(me, n-1) }).call(5))
  end

end

In OO some language implementations, this (or self) is a hidden parameter passed to each method. Thus, there’s a parameter—me in the example code—that is added for handling recursion. If you write a recursive function—like the venerable factorial—with the extra me parameter, a trivial currying operation evaluates it recursively without any need for names.

This is obviously deficient. As noted above, we want to write factorial like so:

lambda { |n| n.zero? && 1 or f.call(n-1) }

We’ll need an f from somewhere, and just as our Scheme colleagues do, we’ll bind one as a parameter in an enclosing lambda. So we want to write:

lambda { |f| lambda { |n| n.zero? && 1 or f.call(n-1) } }

And somehow this should be transformed into a working factorial function. For the test-driven crowd, we want to write:

def test_clean_up_loose_ends
  maker = ...

  factorial = maker.call(
    lambda { |f| lambda { |n| n.zero? && 1 or n * f.call(n-1) } }
  )
  assert_equal(120, factorial.call(5))

  iterative_factorial = maker.call(
    lambda { |f| lambda { |n, acc| n.zero? && acc or f.call(n - 1, n * acc) } }
  )
  tail_factorial = lambda { |n| iterative_factorial.call(n, 1) }
  assert_equal(120, tail_factorial.call(5))
end

Of course, we need some code for maker. And the iterative_factorial case shows that maker works for functions with more than one parameter. The solution I came up with is:

CURRY = lambda { |f, a| lambda { |*b| f.call(a, *b) } }
maker = lambda { |f|
       lambda { |func_with_me| CURRY.call(func_with_me, func_with_me) }.call(
            CURRY.call(lambda { |inner_func, me, *args|
                        inner_func.call(CURRY.call(me, me)).call(*args) }, f)) }

The source code with each transformation from beginning to end is here (I strongly suspect that this “curry combinator” is actually the Y combinator with a huge amount of cruft hanging off it).

Unique or derivative, crap or craft, the process of getting it to work has enriched my mind by forcing me outside of my usual problem space. I still can’t think of a practical application for what I’ve just written. But I know I’ve stretched myself.

And now back to you: perhaps you’re rushing off to try to implement a fixed-point combinator from first principles. Perhaps your plan is to code the canonical examples in your usual language. Those are both good paths. But whether you follow them today or not, remember the underlying principle exemplified by the fixed-point combinator:

Do not dismiss impractical or weird problems. While you may not have an immediate application for the code you write to solve such problems, you are maximizing your learning when you venture outside of your usual problem space.

Named recursion is stuff like foo = lambda { |...| foo.call(:bar) }. It takes advantage of the host language’s variable binding to recurse. If you want anonymous recursion, you should be able to assign the same lambda to another name and have it work just as well, as in: fufu = lambda { |...| foo.call(:bar) }. That won’t work if you are relying on Ruby’s name for foo.

p.s. Don’t miss Tom Moertel’s derivation of the Y combinator in Ruby.

Labels: lispy, passion, popular, ruby

¶ 12:06 AM

Comments on “But Y would I want to do a thing like this?”:

I should have avoided the temptation, but noooooo I just had to check Google Reader on the way to bed. And then I stumbled upon your irresistible Y-combinator post. And then instead of sleeping I started writing code.

It's all your fault. ;-)

Anyway, Haskell makes for a tiny Y combinator:

y f = f (y f)

I wasn't able to make a Ruby version that small, but what I ended up with was satisfyingly concise:

def y(&f)
lambda { |*args| f[y(&f)][*args] }
end

It works pretty much like you would expect:

fac = y { |rec| lambda { |n| n < 2 ? 1 : n * rec[n-1] } }
acc_fac = y { |rec| lambda { |n,acc| n < 2 ? acc : rec[n-1, n*acc] } }
tail_fac = lambda { |n| acc_fac[n, 1] }

fac[5] # ==> 120
tail_fac[5] # ==> 120

Cheers! --Tom

P.S. Sorry about the formatting. I couldn't figure out how to quote code. Feel free to edit the post to fix things up.

# posted by

Tom Moertel : 2:36 AM

Tom:

Your concise version seems to involve a recursive definition: the method y seems to call itself in its body.

Can you rewrite it to eliminate the self-call?

Note that the canonical Y combinator and the curry combinator do not call themselves.

# posted by

Reginald Braithwaite : 8:18 AM

What a coincidence! I just started reading "To Mock a Mockingbird" last night, based on an earlier recommendation you made.

My first reaction: I find it surprisingly hard, even compared to Smullyan's other puzzle books. But it's very interesting, and it will obviously provide an alarmingly thorough grounding in combinatorial logic.

Oddly, I've never learned to love puzzles. Exotic programming languages? Sure. Code katas? Useful. Strange math? I'm finally learning to love it, and it's a lot of fun.

But puzzles have never really clicked for me. And so my interviewing style skews the same way: Coding exercises, design problems, project management questions, but no logic puzzles.

Questions for the readers: Is the love of puzzles something you were born with? Or something you learned? In your experience, is puzzle-solving ability a reliable indicator programming skill? What about the other way around?

# posted by

emk : 8:44 AM

Reg: Can you rewrite it to eliminate the self-call?

Not in Haskell, at least not without getting some data structures involved.

The problem: You can't write the U combinator in Haskell (it won't typecheck), so there's no way to get "fix e = (U I) (U e)" to expand.

Haskell corresponds to the strongly-typed lambda calculus, and the only fixpoints are those provided by the language itself, either in Tom's "y f = f (y f)", or in "data" definitions.

I've heard you can build a Y-combinator of sorts using the latter, but never actually tried it.

# posted by

emk : 8:55 AM

Eric: I'm the same way about puzzles. The Y combinator was a mildly interesting toy to me, sort of like a Rubik's cube or a crossword puzzle. I could play with it for a few minutes, but it couldn't really hold my attention.

What really put me over the top about learning it was when I started writing a parser combinator library. If you write those combinator as recursive functions, then left-recursion will make the parser loop, and I found that really inelegant. But if you write a customized recursion combinator, your recursive parser can check to ensure you're making progress through the stream, and abort if you start to loop. Writing such a combinator requires understanding how the Y combinator works, and that gave me the energy to really dig into it.

My learning methodology is to read very broadly but shallowly to get a big store of ideas I've heard about (as contrasted with ideas I really understand). Then, when I write some code that seems inelegant to me, I'll try and pattern match against the things I've read about, and rummage for some applicable-sounding ideas. Then, I'll study them in depth to figure out how to write something more general.

I think this is a less common style among hacker types, who generally find intrinsic excitement in puzzles. For example, this year's ICFP contest was an enormous home-run hit, but it was one that left me a little cold (even though the guys who did it put in an awesome amount of effort to make it cool).

# posted by

Neel Krishnaswami : 11:44 AM

Ah. I didn't realize the part of the challenge was to limit myself to lambda-calculus features. (I'm still going to use Ruby's splat operator, however.) Alright, then, let's work it through...

Here's my original Ruby definition:

def y(&f)
lambda { |*args| f[y(&f)][*args] }
end

To get rid of the recursive call y(&f), let's replace it with a variable yf, whose value we'll say is magically equivalent to the result of the recursive call. To effect the binding of yf, we'll need to wrap the method body in a new lambda expression and pass in the magic value:

def y(&f)
lambda { |yf|
lambda { |*args| f[yf[yf]][*args] }
}[ <<< placeholder for yf: we'll fill this hole next >>> ]
end

So, what value should we pass in for yf? Because the variable represents the value that would have been computed by calling y(&f), which is just the body of the y method, we can fill the hole with a duplicate of the body:

def y(&f)
lambda { |yf|
lambda { |*args| f[yf[yf]][*args] }
}[ lambda { |yf| lambda { |*args| f[yf[yf]][*args] } } ]
end

And we have our Y combinator:

fac = y { |rec| lambda { |n| n < 2 ? 1 : n * rec[n-1] } }
fac[5] # ==> 120

While the combinator works, its implementation repeats itself: there are two copies of the lambda expression

lambda { |yf| lambda { |*args| f[yf[yf]][*args] } }

in the method body. To factor them out, let's let the variable x represent this expression. Then the method body simplifies to x[x], which is very satisfying. To set x to the correct value, however, we need to add a wrapper lambda that binds x to the original expression. Putting it all together, we arrive at the final version:

def y(&f)
lambda { |x| x[x] } [
lambda { |yf| lambda { |*args| f[yf[yf]][*args] } } ]
end

And it, too, works:

acc_fac = y { |rec| lambda { |n,acc| n < 2 ? acc : rec[n-1, n*acc] } }
tail_fac = lambda { |n| acc_fac[n, 1] }
tail_fac[5] # ==> 120

Cheers! --Tom

# posted by

Tom Moertel : 11:52 AM

Without having looked at the problem too deeply, it occurs to me that the Y Combinator is basically the same problem as a program that reproduces its own source code.

# posted by

Damien : 12:13 PM

This is fun. :)

Here's an JavaScript implementation:
// helper function to convert arguments object to an array
function itoa(i,start) {
var a=[];
for(var x = start ? start : 0 ; x < i.length; x++) a.push(i[x]);
return a;
};
// another helper to implement currying
function curry(func, arg) {
return function() {
return func.apply(Function(), [arg].concat(itoa(arguments)));
}
};
// And here's Y, we need those helpers here
function Y(f) {
return function(h) {
return curry(h, h);
}( curry(function(f, self) {
return f(curry(self, self)).apply(Function(), itoa(arguments,2));
}, f)
)
};
// Just to show it works
fact = Y( function (recursion) { return function(n) { return n < 2 ? 1 : n*recursion(n-1); } });
print(fact(5)); // prints 120

And Python:
# Try to use Python 2.5's builtin curry
try: from functools import partial
except ImportError: partial = lambda f, a: lambda *args: f(a, *args)
# Y
def Y(f):
return (lambda h: partial(h,h ))(
partial(lambda f, self, *args: f(partial(self,self))(*args), f) )
# Works
fact = Y(lambda self: lambda n: n < 2 and 1 or n*self(n-1))
print fact(5) # 120

# posted by

Ants Aasma : 3:08 PM

Ants:

May I repost yur solutions? if so, do you have an URL for attribution?

# posted by

Reginald Braithwaite : 3:12 PM

Sure. I don't have an URL though.

# posted by

Ants Aasma : 3:43 PM

"You simply can’t imagine problems that your tools can’t solve well, much less can’t solve at all."

Interesting take on the Sapir-Whorf hypothesis there.

# posted by

Adam : 6:04 PM

Nice reference, I hope you have better friends than Mark Renton.

# posted by

David Seruyange : 7:17 PM

Worth reading:

Hrmmm.

# posted by

Reginald Braithwaite : 10:51 PM

I wrote a similar post about building "basic" programming abstractions -- booleans, integers, lists and recursion -- in the untyped lambda calculus using a subset of scheme. Take a look.

# posted by

Cotton Seed : 2:36 AM

Here's my version of Y in Ruby:

Y = proc {
|f|

proc {
|g|
g[g] }.call(
proc {
|h|

proc {
|n|

(f[h[h]])[n]
}
})
}

Using Y to define factorial:

fac = Y[
proc {
|h|

proc {
|n|
if n < 2 then
1
else
n * h[n - 1]
end
}
}]

puts fac[5] # prints 120

Admittedly it's not as general as your Y: the recursvie function defined takes one and only one argument. But this should be easy to fix using *args forms.

You can read my blog entry to see how I derived this implementation:

http://www.feiling.org/blogentries/show/16

# posted by

Fei Ling : 5:40 PM

I was prompted by this post to explore a similar topic in scheme See my blog entry

# posted by

Sean Fritz : 5:07 PM

> That’s why there are so few continuation-based HTTP servers.

That's funny, I thought it was because HTTP is a stateless protocol. =)

Great post though.

# posted by

Phil : 4:27 PM

<< Home