Really useful anamorphisms in Ruby
Really simple anamorphisms in Ruby introduced a very simple
unfold. Its chief characteristics were that it generated an Array from a value of some sort, and it did so by applying an incrementor block to its seed recursively until it generated nil. For example:
10.class.unfold(&:superclass)
=> [Fixnum, Integer, Numeric, Object]
A very simple modification allows us to separate the two blocks with a :while or :to pseudo-keyword, and to add a :map keyword for transforming the state into the desired result. Thus, this really simple unfold:
1.unfold(&'_+1 unless _==10').map(&'**2')
=> [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
Can also be expressed as:
1.unfold(:to => '==10', :map => '**2', &'_+1')
=> [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
The latter form is helpful once unfolds become larger and more complex than these simple one-liners.
(There is another style of writing :unfold, using method chaining and lazy evaluation to eliminate lambda keywords, but we will save that for another time: it is a great examination of syntax but does not change :unfold’s fundamental behaviour.)
Let’s turn it up a notchThese trivial examples are not particularly compelling. Unfold is touted as the complement to :inject. So you would expect :unfold to be as useful as :inject. And :inject is very, very useful—you “reduce” lists of things to values all the time.
But how often do you need to turn a value into a list? How often do you need to turn ‘10’ into ‘[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]’? And if you do, what’s wrong with using
(1..10).map(&'**2')?
Remember that :unfold can be applied to objects with a lot more information to them. The thing that had me stuck when I first saw :unfold was thinking of it as the opposite of :inject. Or at least, the opposite of how I used :inject. I tended to use :inject in a way that reduced information. For example:
[7, 6, 10, 3, 9, 4, 8, 5, 2, 1].inject(&'+')
This gives us the sum of the numbers from one to ten, as it happens. It also gives us a value that is considerably simpler than the list we used to generate the number. Information is lost when we use :inject to “reduce” a list to a very simple value. So my first reaction to :unfold was to think of ways to use :unfold on very simple values, like numerics.
But :unfold doesn’t have to work with simple values. It can work with arbitraily complex data structures. Consider:
def zip(*lists)
lists.unfold(
:while => '.first',
:map => '.map(&".first")',
&'_.reject(&".length < 2").map(&"[1..-1]")')
end
Zip is a function that takes two (or more, but let’s just say two for now) lists, and produces a list of pairs of items. So:
zip([:a, :b, :c], [1, 2, 3])
=> [[:a, 1], [:b, 2], [:c, 3]]
How does :unfold do it? First, of course, it makes a single list of lists. It then performs an unfold on this single data structure. The incrementor successively reduces each sublist by removing the first items. So the output of the successive incrementor operations is:
[
[[:a, :b, :c], [1, 2, 3]],
[[b, :c], [2, 3]]
[[:c], [3]]
]
The :map then extracts the first items from each sublist and presents them as a list:
[
[:a, 1],
[:b, 2],
[:c, 3]
]
Neat. But why do we care about zip? Well, if you’ll notice, we already have a bunch of really useful things we can do with lists, like :map, :select, :reject, :detect, and so on. What would you do if you had two lists and needed to do something with each pair in the list, like… A list of first names and surnames that need to be catenated together?
zip(first_names, surnames).map(&'"#{_[0]} #{_[1]}"')
Zip is useful when we have a bunch of parallel lists and there’s something we want to do with each tuple from the lists.
Generalized iterationWe recognize this “pattern,” it’s one of the most powerful in programming. Zip was one algorithm, a way of iterating over several lists simultaneously. The other algorithm was
"#{_[0]} #{_[1]}", a recipe for what to do with the successive tuples of values.


The Ruby Way
is the perfect second Ruby book for serious programmers. The Ruby Way contains more than four hundred examples explaining how to do everything from distribute Ruby with Rinda to functional programming techniques just like these.
The powerful idea was to separate the mechanics of turning a data structure into a linear series of values—iterating—from what we want to actually do with each value. (In OO-style programming, we would define a method for lists of lists that returns an iterator over the tuples of values. Same thing, proving that how you do it is not as important as understanding
why you do it.)
Unfold has other uses, but this one alone is worth the trouble to understand the pattern even if you aren’t rushing to implement this exact unfold method: Converting a single data structure to a list is one way to implement iteration: for any data structure, you can use unfold to define a linear iteration. You can then use :each or :map or :inject just as our parents before us would have used DO or FOR loops.
Consider this (inelegant, but I’m writing this rather late at night) unfold:
[[1, 2, 3], [4, 5, 6, [7, 8]], 9, 10].unfold(
:while => '.first',
:map => lambda { |first|
first = first.first while first.kind_of?(Array)
first
}
) { |state|
state = state.first + state[1..-1] while state.first.kind_of?(Array)
state[1..-1]
}
=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
This is the same idea, only we convert a tree into a list representing a depth-first search of a simple tree. You may recognize it as Array’s :flatten method. Once again, it’s really a way of iterating over the elements of a tree. So one way to think of this :unfold is that it is an iterator over a tree’s leaves:
def flatten(arr)
arr.unfold(
:while => '.first',
:map => lambda { |first|
first = first.first while first.kind_of?(Array)
first
}
) { |state|
state = state.first + state[1..-1] while state.first.kind_of?(Array)
state[1..-1]
}
end
But I already know how to write Zip and Flatten methods, honest I do.Zip and Flatten
are relatively common, that’s why :flatten and :zip can both be found in Ruby’s standard Array class. And if there’s a data structure that needs regular unfolding, you ought to weigh the advantages and disadvantages of writing an :unfold for it or using more humdrum ways of writing an iterator.


The Haskell School of Expression
is a terrific and relatively jargon-free introduction to the language that popularized fold, unfold, and all of the other functional programming idioms. As Eric Kidd says, it will make your head explode. Recommended!
However, what do you do when you only need to unfold something once? For example, perhaps you have code that obtains some data in JSON format, and having used a library to parse the JSON into a one-off list or hash, you want to iterate through it.
With unfold, you can write your one-time, specific iterator right in place. This is no different than using blocks and lambdas in Ruby for one-off functions that really don’t need the cermony and weight of being implemented as methods.
When you want to iterate through something, and you want to separate the mechanism for iterating through the data from what you do with the data, :unfold should be in your tool box.
Unfold and the bio-sciences. Not really.I like to think of :unfold like unfolding a protein molecule. When you stare at a data structure, it’s dense, opaque. But you supply an unfold algorithm, and what looked like a messy ball of twine unravels into a long filament made up of simple elements. You can then operate on the simple elements, without getting what you want to do en-snarled in how you iterate over the data structure.
So there you have it. Unfold can be really useful if we see it as a standardized way to write iterators for data structures.
Update: The under-appreciated unfold.Labels: lispy, popular, ruby
Really simple anamorphisms in Ruby
Anamorphisms are functions that map from some object to a more complex structure containing the type of the object. They are the dual—a fancy word for complement—of Catamorphisms, functions that map from some complex structure down to a simpler object.
In simpler terms, a Catamorphism is a function like
inject: In Ruby, Inject takes a collection and produces something simpler. It is also called
fold or
reduce in other languages. Inject can do something like produce the sum of a list:
(1..5).inject &'+' => 15
Unfold does the reverse: it takes a single value and turns it into a collection. Now, a proper unfold can be configured with a seed value, a transformation, a stopping predicate, maybe a distinction between states and output values, and even the type of structure you want to create. A proper unfold would even work with
lazy lists if it didn’t have a stopping condition.
But sometimes you want something really simple. The state and the output value could be the same thing, eliminating a transformation from state to output. The stopping predicate could simple, it could stop when it reaches
nil. And it could always return Arrays. So you could use such a simple unfold whenever you have a seed value and some sort of function (expressed as a block) that returns nil when it has no more values.
For example:
10.unfold { |n| n-1 unless n == 1 }.inspect => [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
10.class.unfold(&:superclass).inspect => [Fixnum, Integer, Numeric, Object]
Hey, what happens if you combine really simple anamorphisms with really simple catamorphisms?
5.unfold(&'_-1 unless _==1').inject(&'*') => 120
Here is the code
unfold.rb. (If
overloading core classes with too much responsibility is not to your taste, it is trivial to express unfold as a function taking a seed and a block as arguments).
Now that I’ve whet your appetite, here’re
Really useful anamorphisms in Ruby. And for another implementation,
everything you ever wanted to know about implementing a true unfold in Ruby.
Enjoy!
(Code courtesy of
Mobile Commons. Thanks!)
Here’s an interesting email from Hugh Sasse:I was reading Code Complete 2
last night and Steve McConnell says that he wouldn’t hire a programmer who wrote a recursive factorial function. [I’m not sure that would be such a crime in languages like Lua with proper tail recursion, but still. (I’m probably wrong, now I’ve written that! “Open mouth, insert foot, echo internationally” as they used to say on Fidonet :-))] I thought “that seems a bit harsh”, especially since in real life they’d get one from a library most of the time, anyway.
So, as I have often run out of stack on Ruby, I’ve tried to rewrite unfold as an iterative function. It seems to be working. The script as modified blows up at 5000 on my system with the recursive version, but the iterative version succeeds at 5000.
Thanks for this blog entry. There’s something REALLY nice about this idea, which I can’t put my finger on. Might be something to do with loops terminating when they should, and The Pragmatic Programmers’ article about “cook until done”
http://www.pragprog.com/articles/cook-until-done
Unfold feels like a “do until finished” loop.
class Object
# As above, but iterative, rather than recursive.
def unfold2 &block
result = [self]
x = block.call(self)
while not x.nil?
result.push x
x = block.call(x)
end
return result
end
end
Labels: lispy, popular, ruby
String#to_proc
Breaking news! The irb enhancement gem Utility Belt includes String#to_procString#to_proc is an addition to Ruby’s core String class to enable
point-free hylomorphisms…
I’ll start again. String#to_proc adds a method to Ruby’s core String class to make lots of mapping and reducing operations more compact and
easier to read by removing boilerplate and focusing on what is to be done. In many cases, the existing black syntax is just fine. But in a few cases, String#to_proc can make an expression
even simpler.
String#to_proc is a port of the String Lambdas from Oliver Steele’s
Functional Javascript library. I have modified the syntax to reflect how String#to_proc works in Ruby.
We’ll start with the examples from String Lambdas so you can see what is actually going on. Then we’ll look at how to use the
& coercion to make working with arrays really simple.
to_proc creates a function from a string that contains a single expression. This function can then be applied to an argument list, either immediately:
'x+1'.to_proc[2];
→ 3
'x+2*y'.to_proc[2, 3];
→ 8
or (more usefully) later:
square = 'x*x'.to_proc;
square(3);
→ 9
square(4);
→ 16
Explicit parametersIf the string contains a
->, this separates the parameters from the body.
'x y -> x+2*y'.to_proc[2, 3];
→ 8
'y x -> x+2*y'.to_proc[2, 3];
→ 7
Otherwise, if the string contains a
_, it’s a unary function and
_ is name of the parameter:
'_+1'.to_proc[2];
→ 3
'_*_'.to_proc[3];
→ 9
Implicit parametersIf the string doesn’t specify explicit parameters, they are implicit.
If the string starts with an operator or relation besides
-, or ends with an operator or relation, then its implicit arguments are placed at the beginning and/or end:
'*2'.to_proc[2];
→ 4
'/2'.to_proc[4];
→ 2
'2/'.to_proc[4];
→ 0.5
'/'.to_proc[2, 4];
→ 0.5
’.’ counts as a right operator:
'.abs'.to_proc[-1];
→ 1
Otherwise, the variables in the string, in order of occurrence, are its parameters.
'x+1'.to_proc[2];
→ 3
'x*x'.to_proc[3];
→ 9
'x + 2*y'.to_proc[1, 2];
→ 5
'y + 2*x'.to_proc[1, 2];
→ 5
ChainingChain
-> to create curried functions.
'x y -> x+y'.to_proc[2, 3];
→ 5
'x -> y -> x+y'.to_proc[2][3];
→ 5
plus_two = 'x -> y -> x+y'.to_proc[2];
plus_two[3]
→ 5
Using String#to_proc in Idiomatic RubyRuby on Rails popularized
Symbol#to_proc, so much so that
it will be part of Ruby 1.9.
If you like:
%w[dsf fgdg fg].map(&:capitalize)
→ ["Dsf", "Fgdg", "Fg"]
then
%w[dsf fgdg fg].map(&'.capitalize') isn’t much of an improvement.
But what about doubling every value in a list:
(1..5).map &'*2'
→ [2, 4, 6, 8, 10]
Or folding a list:
(1..5).inject &'+'
→ 15
Or having fun with factorial:
factorial = "(1.._).inject &'*'".to_proc
factorial[5]
→ 120
String#to_proc, in combination with
& coercing a value into a proc, lets you write compact maps, injections, selections, detections (and many others!) when you only need a simple expression.
Caveats: String#to_proc uses
eval. Cue the chorus of people—pounding away on quad 3Ghz systems—complaining about the performance. You’re an adult. Decide for yourself whether this is an issue. After
mankying things about to deduce the parameters, String#to_proc evaluates its expression in a different binding than where you wrote the String. This matters if you include free variables. My thinking is that it ceases to be a simple, easy-to-understand hack and becomes a cyrptic nightmare once you get too fancy.
You know that Voight-Kampff test of yours… did you ever take that test yourself?
—Rachael, Blade Runner
I have been using
Functional Javascript for quite some time now, and I use the String Lambdas a lot. However, Ruby and Javascript are very different languages. Once you get out of the browser’s DOM, Javascript is a lot cleaner and more elegant than Ruby. For example, you don’t need to memorize the difference between a block, a lambda, and a proc. Javascript just has functions.
However, Javascript is more verbose: Whereas in Ruby you can write
[1, 2, 3].map { |x| x*2 }, if Javascript had a
map method for arrays, you would still have to write
[1, 2, 3].map(function (x) { return x*2; }). So it’s a big win to make Javascript less verbose: code is easier to read at a glance when you don’t have to wade through jillions of function keywords.
Nevertheless, I still find myself itching for the String Lambdas when I’m writing Ruby code. It may be a matter of questionable taste, but for certain extremely simple expressions, I vastly prefer the point-free style.
(-3..3).map &:abs is shorter than
(-3..3).map { |x| x.abs }.
It is also cleaner to me.
abs is a
message, especially in a language like Ruby that supports the sending arbitrary messages named by symbols. Writing
(-3..3).map &:abs looks very much like sending the
abs message to everything in the list. I don’t need an
x in there to tell me that.
Thus, I obviously like
(-3..3).map &'.abs'. But I like
(1..5).map &'*2' for the same reason. It isn’t just shorter, it hides a temporary variable that really doesn’t mean Jack to me when I’m reading the code. And quite honestly,
(1..10).inject { |acc, mem| acc + mem } raises more questions than it answers about what
inject does and how it does it.
(1..10).inject &'+' gets right down to business for me. I’d prefer that it be called “fold,” but the raw, naked
+ seems to describe what I want
done instead of how I want the computer to do it.
Symbol#to_proc also supports named parameters, either through implication (
&'x+y') or with the arrow (
'x y -> x*y'). I haven’t thought of a case where that would be a win over using a Ruby block:
{ |x, y| x*y }.
I’m divided about the underscore notation. It seems like a good compromise for expressions where there is a single parameter and it doesn’t fall on the left or the right side of an expression. Standardizing on an unusual variable name is, I think, a win. Underscore often means a “hole” in an expression or a computation, so it feels like a good fit. I would honestly much rather see something like:
&'(1/_)+1' than
&'(1/x)+1'. The underscore jumps out in an obvious way, and it wouldn’t be magically clearer to write
{ |x| (1/x)+1 }.
That being said, I haven’t actually written an underscore expression yet in actual code, so far I’m getting by using the point-free expressions to simplify things and using Ruby blocks for everything else.
RSpec
describe "String to Proc" do
before(:all) do
@one2five = 1..5
end
it "should handle simple arrow notation" do
@one2five.map(&'x -> x + 1').should eql(@one2five.map { |x| x + 1 })
@one2five.map(&'x -> x*x').should eql(@one2five.map { |x| x*x })
@one2five.inject(&'x y -> x*y').should eql(@one2five.inject { |x,y| x*y })
'x y -> x**y'.to_proc()[2,3].should eql(lambda { |x,y| x**y }[2,3])
'y x -> x**y'.to_proc()[2,3].should eql(lambda { |y,x| x**y }[2,3])
end
it "should handle chained arrows" do
'x -> y -> x**y'.to_proc()[2][3].should eql(lambda { |x| lambda { |y| x**y } }[2][3])
'x -> y z -> y**(z-x)'.to_proc()[1][2,3].should eql(lambda { |x| lambda { |y,z| y**(z-x) } }[1][2,3])
end
it "should handle the default parameter" do
@one2five.map(&'2**_/2').should eql(@one2five.map { |x| 2**x/2 })
@one2five.select(&'_%2==0').should eql(@one2five.select { |x| x%2==0 })
end
it "should handle point-free notation" do
@one2five.inject(&'*').should eql(@one2five.inject { |mem, var| mem * var })
@one2five.select(&'>2').should eql(@one2five.select { |x| x>2 })
@one2five.select(&'2<').should eql(@one2five.select { |x| 2<x })
@one2five.map(&'2*').should eql(@one2five.map { |x| 2*x })
(-3..3).map(&'.abs').should eql((-3..3).map { |x| x.abs })
end
it "should handle implied parameters as best it can" do
@one2five.inject(&'x*y').should eql(@one2five.inject(&'*'))
'x**y'.to_proc()[2,3].should eql(8)
'y**x'.to_proc()[2,3].should eql(8)
end
end
Go ahead, download the
source code for yourself.
Update: Reg smacks himself in the head!I had a look at the source code for
Symbol#to_proc:
class Symbol
# Turns the symbol into a simple proc, which is especially useful for enumerations. Examples:
#
# # The same as people.collect { |p| p.name }
# people.collect(&:name)
#
# # The same as people.select { |p| p.manager? }.collect { |p| p.salary }
# people.select(&:manager?).collect(&:salary)
def to_proc
Proc.new { |*args| args.shift.__send__(self, *args) }
end
end
Look at that: Although the examples are all of unary messages like
.name, the lambdas created handle methods with arguments. And since almost everything in Ruby is a method, including operators like
+… You can use Symbol#to_proc to do some of the point-free stuff I like:
[1, 2, 3, 4, 5].inject(&:+)
→ 15
[{ :foo => 1 }, { :bar => 2 }, { :blitz => 3 }].inject &:merge
→ {:foo=>1, :bar=>2, :blitz=>3}
Labels: lispy, popular, ruby
map(&"(1.._).inject(&'*')")
Some working Ruby code:
(1..3).map(&'*2') => [2, 4, 6]
(1..3).map(&'[-1, _, 0]') => [[-1, 1, 0], [-1, 2, 0], [-1, 3, 0]]
(1..3).map(&'x -> y -> x * y').map(&'[2]') => [2, 4, 6]
(1..5).select(&'>2') => [3, 4, 5]
(1..3).map(&'x -> y -> x * y').map(&'.call(2)') => [2, 4, 6]
[5].map(&"(1.._).inject(&'*')") => [120]
See
String#_to_proc.
Labels: lispy, ruby
Too much of a good thing: not all functions should be object methods
OOP is several different ideas put together, the most important of which is
Fine-Grained Information Hiding.
One can think of information hiding as being the principle and encapsulation being the technique. A software module hides information by encapsulating the information into a module or other construct which presents an interface.
The basic principle of all OO languages is that relatively small things—such as individual accounts in a business program—each encapsulate both their data (in the form of members) and their algorithms (in the form of methods). Our notions of members and polymorphism both work to this goal of hiding information. There’s a lot more to most OO languages, such as whether they include a notion of types and what mechanisms they use for sharing common behaviour. But let’s look at this one principle:
objects are responsible for their data and for their algorithms.
should objects be responsible for all of their own behaviour?There’s a general idea that in a well-constructed program, each object “knows” how it ought to behave. That’s what its methods are for. Quite obviously, objects cannot be responsible for
everything involving them in a program. If each object completely encapsulated all of the things it could do or be involved in, you would never pass one object as a parameter in a message to another object.
For every complex problem, there is a solution that is simple, neat, and wrong.
—H. L. Mencken
For example, you would never have collections. If every object “knew” how to organize itself into collections, you wouldn’t need an Array or Hash, would you? In practice, each object in a system can be involved in many different actions. It has to be responsible for some of them, and it has to play a secondary, passive role in others. Most OO programs do not have every object implement its own collections methods. They may include some form of specialization so you can have an array of accounts, but an array of accounts is still not an account.
subject.verb(object)In the English language, we have the idea of a
Subject and an
Object in a sentence. For example, when we say “Jack loves Jill,” Jack is a subject and Jill is an object. Jack loves. Jill is loved. It’s the same in OO programs. Sometimes objects are actively doing things through their methods. Sometimes other object’s methods are doing things with them.
Verbing Weirds Language
Good OO design is, in part, doing a good job of choosing the right bifurcations: given a list of nouns and verbs, making the right decisions about which nouns ought to be the active nouns, the subjects, the ones that “own” the verb in the form of a method. And thus consciously making decisions about which objects ought to be the passive nouns, the objects of the verbs, the ones that
don’t implement the methods.
Unfortunately, there are lots of places where we can err on the side of giving too much responsibility to individual objects. It’s understandable, given that OO is theoretically all about objects being responsible for themselves. But as in many other things, in practice good OO is about objects being responsible for a little as possible (but no less!), not as much as possible.
the kingdom of nounsOne common symptom of this problem is
a system that has objects for all of the obvious nouns or entities, but not for the verbs. OO began with languages like
Simula, where the paradigm was trying to represent real-world entities such as automobiles on a highway. From that time forward, the emphasis has been on having objects for each noun in the problem domain. In such traditionally-organized OO programs, the “verbs” or actions are all attached to objects as methods.


Object Design: Roles, Responsibilities, and Collaborations
focuses on the practice of designing objects as integral members of a community where each object has specific roles and responsibilities. The authors present the latest practices and techniques of Responsibility-Driven Design and show how you can apply them as you develop modern object-based applications.
Not all “verbs” have a clear separation between a single entity that is the subject or active entity that ought to own the verb’s definition and the secondary, passive subject entities that should not own the verb’s definition. The easiest examples of this are operations that are intended to be
commutative.
For example, many languages define addition as a method belonging to numbers or magnitudes. In Smalltalk, the expression
1 + 2 actually means “
send the message + 2 to the object 1.” At first glance, this seems elegant: the number
1 handles the message
+ 2 as integer addition, while
1.0 would handle the same message with floating point arithmetic. What more could you want?
Well, there is a huge problem with this arrangement:
Addition is commutative.
1.0 + 2 must give the same result as
2 + 1.0. Using a simple message to implement addition means that you must be excruciatingly careful to handle all of the possible cases so that you do not accidentally violate this property. Now of course, the designers of system classes like
Integer and
Float went to this trouble. But if you want to add another magnitude class—say
CurrencytwoPlaceDecimal—you have to open up all of the system classes and modify them so that
1 + ThirtyCents gives the same result as
ThirtyCents + 1.
beware of breaking symmetryOf course, you may not need to implement a new magnitude class. Fine. But what about
symmetric relations like
comparison? This is a major pitfall for OO developers: in many cases you need to write a test of equivalence or equality (operations like
==,
equal?,
eql?,
eqv? and all of the other variations on the same theme). In every one of these cases, horrible things will happen if your operation is not symmetric. For every case,
x.eql?(y) if-and-only-if
y.eql?(x).
This is obviously easy when
x and
y are both the same kind of object.
What happens when they’re different, but still logically equivalent? It turns out that implementing commutative operations and symmetric relations as methods doesn’t work very well. It forces you to smear duplicate logic over many different classes (or prototypes, if your language swings that way).
Here’s a practical example. Let’s say you want to implement a form of equivalence for collections. For ordered collections like lists, what you want is that if two ordered collections have the same members, in the same order, they are equivalent. It’s easy to imagine writing such a method as a
mixin for all of your ordered collections. It obviously knows about iterating over ordered collections (recursively, if you grew up with
Godel, Escher, Bach on your night stand). Note that you may not have an indexed collection: you might have a
list where you simply retrieve values in order.
And likewise, you can write a collection equivalence method for dictionaries like hash tables: if two objects have the same values at the same keys, they are equivalent. Again, a simple mixin will handle things for dictionaries.
Now comes the wrinkle: you decide that an ordered collection ought to be equivalent to a dictionary where the keys are the integers ascending from zero. In other words,
('foo' 'bar' 'blitz') ought to be equivalent to
{ 0 => 'foo', 1 => 'bar', 2 => 'blitz' }. How are you going to code this? Well, the dictionary mixin could obviously handle equivalence to an ordered list. But we need symmetry, so we have to “open up” the ordered collection mixin and add code for equivalence to dictionaries.
Actually I made up the term object-oriented and I can tell you I did not have C++ in mind. The important thing here is that I have many of the same feelings about Smalltalk.
—Alan Kay
I’m holding my nose, we have not one but two different code smells: 1) Why is one piece of logic in two different places? 2) Why do ordered collections know anything at all about dictionaries, and why do dictionaries know anything at all about ordered collections? The latter is especially disturbing: the whole point of OO is information hiding. How does having ordered collections and dictionaries knowing about each other help us to hide information?
The obvious answer to me is that the knowledge of how to compare an ordered collection to a dictionary does not belong in ordered collections or in dictionaries. The requirement that relations like equivalence be symmetrical across heterogeneous types implies that the types themselves cannot be responsible for implementing equivalence for themselves.
There are similar problems of code duplication and information leakage apply to modelling relations (why do we declare
has_one and
belongs_to in Rails) and implementing the
<=> operator in Ruby. It looks like having verbs “belong to” the subject noun is often a good idea, but not always a good idea.
commuting the sentence of executionMaybe some verbs belong to objects, but some are best on their own? Maybe
+ and
<=> and
equivalent? really ought to be emancipated from their subservience to objects and ought to have their own definitions.
There are two real approaches to object-orientation. The first is known as message-passing. You send an object a message and ask it to deal with it. (This would not work with many people in this newsgroup.) The meaning of the message is local to the object, which inherits it from the class of which it is an instance, which may inherit it from superclasses of that class…
The second approach is generic functions. A generic function has one definition of its semantics, its argument list, and is only specialized on particular types of arguments.
What we ought to do is take some of the verbs and give them their own place in our programs, instead of hanging them off nouns. This isn’t such a revolutionary idea: Common Lisp’s
Metaobject Protocol does this exact thing, providing
generic functions. A generic function is, in effect, a verb raised to the same level of abstraction as a noun.
This isn’t some revolutionary idea limited to “powerful” languages either: the Java collections framework uses a
Comparable interface for ordering collections. The
compareTo(...) method belongs to an object. By way of—ahem—comparison, the
Comparator interface extracts comparison out of the subject object and puts it in a separate function object. You can perform sorts in Java either way.
If we aren’t using Common Lisp, can we build the verbs we want out of the tools at our disposal? In other words, can we
Greenspun generic functions in languages like Java and Ruby?
generic functions in java, plus a detailed look at method dispatchingLet’s start by thinking about generic functions in a Java-like language.
Returning to our example of writing
equivalent?, we might make an
Equivalent class with a single method, perhaps we can call it
eval. So we end up with something like
Eqivalent.eval(foo, bar). Java-like languages allow us to write different versions of the
eval method with different type signatures, so we can write:
public static boolean eval (List foo, List bar) { ... }
public static boolean eval (List foo, Map bar) { ... }
public static boolean eval (Map foo, List bar) { return eval(bar, foo); }
public static boolean eval (Map foo, Map bar) { ... }
And so forth. Are we done?
No, our code is broken. What happens when we decide that the “default” equivalence is the
== relationship. We can’t write:
public static boolean eval (Object foo, Object bar) { return foo == bar; }
This is hideously broken in languages like Java. You’re almost all nodding in agreement, but please be patient while I explain it anyway: you probably want to pass this along to someone who really needs to be told why it is broken, so why don’t I go ahead and explain it for
them?
What you want is that if two objects are of the more specific types—
List and
Map—we will call the more specific version of the
eval methods. But if we can’t “match” one of the more specific
eval methods, we want to use
eval (Object foo, Object bar). Too bad, that’s not how Java works.
Java uses two completely different ways to figure out which method to call when you overload methods!Way number one is is for figuring out that when you call
noun.verb(...), where do we find the definition for
verb? This lookup is effectively done at run time, so that even if your code looks like this:
public static void printSomething(Object foo) {
System.out.println(foo.toString());
}
Java will look up the method
toString based on
foo’s actual type when the method is called, even though you declared it to be an
Object. That’s polymorphism at work, and it’s the information hiding working for us. Each object can do it’s own thing where
toString is concerned, and we don’t have to worry about it. This is called
single dispatch, because it figures out which method to call based on just one of the nouns, the subject noun a/k/a the receiver of the method invocation.
But that’s not what happens when we write this:
public static void printSomethingElse (Object foo, Object bar) {
if (Equivalent.eval(foo, bar))
System.out.println("2 x " + foo);
else System.out.println(foo.toString() + ", " + bar.toString());
}
It will
always call
eval (Object foo, Object bar). It will
not call
eval (List foo, List bar) if you pass it two lists. That’s because although each of our methods have the same name—
eval—Java treats them as different methods, and it figures out which one to call based on the declared types of the parameters at compile time, not on the actual types of the parameters’ values at run time.
Besides writing a Lisp interpreter in Java, your next best bet for building a generic function the way we want it is to find a way to turn Java’s single dispatch into a multi-dispatch, to dispatch on two nouns,
foo and
bar.
The good news is this: dispatching at run time on two different types is a well-known problem, and the solution is called
double dispatch. The problem with double dispatch is that it moves our equivalence code back into our nouns, and we don’t want that.
The
Visitor pattern might be handy: it’s a way to add methods to an object at run time in a language like Java that supposedly doesn’t do that. If we decide that everything to be compared using
Equivalent.evalimplements an interface called
Visitable, we can build a double dispatch system that doesn’t require putting an
equivalent? method in the entities being compared:
interface Visitable {
Object accept(final Visitor visitor);
}
interface Visitor {
Object visit(final Object obj);
Object visit(final List list);
Object visit(final Map map);
}
public class Equivalent {
static boolean list_list (List foo, List bar) { ... }
static boolean list_map (List foo, Map bar) { ... }
static boolean map_map (Map foo, Map bar) { ... }
static boolean object_object (Object foo, Object bar) { ... }
public static boolean eval (final Visitable foo, final Visitable bar) {
return foo.accept(
bar.accept(
new Visitor () {
public Object visit(final Object bar) {
return new Visitor () {
public Object visit(final Object foo) {
return object_object(foo, bar);
}
public Object visit(final List foo) {
return object_object(foo, bar);
}
public Object visit(final Map foo) {
return object_object(foo, bar);
}
}
}
public Object visit(final List bar) {
return new Visitor () {
public Object visit(final Object foo) {
return object_object(foo, bar);
}
public Object visit(final List foo) {
return list_list(foo, bar);
}
public Object visit(final Map foo) {
return list_map(bar, foo);
}
}
}
public Object visit(final Map bar) {
return new Visitor () {
public Object visit(final Object foo) {
return object_object(foo, bar);
}
public Object visit(final List foo) {
return list_map(foo, bar);
}
public Object visit(final Map foo) {
return map_map(foo, bar);
}
}
}
}
)
)
}
}
If that looks like a lot of work to you, I agree. You’re basically replicating Java’s run time dispatching on two types, so you need a bit of a matrix. Is it worth the effort? Let’s consider what this wins you:
- Your entities or objects no longer need to know all about other types of entities;
- It’s easier to make sure that commutation and symmetry are preserved when the code for a relationship is in its own class and not smeared over multiple entities.
And best of all, you have a nice place for your verbs, and they are no longer second-class citizens behind the nouns.
Update: A few people have suggested alternate approaches to implementing multiple dispatch in Java. I think there are various trade-offs to be made, and several different implementations ought to be considered before you write production code.
However, the point of the article is to suggest that not all functions should be implemented as methods of subject objects. I think it makes that point regardless of what you think of using a Visitor and a double dispatch.
Here’s an alternate approach from
Laurie Cheers:
interface Classifiable
{
int classify();
}
// I know this isn't valid Java, but it makes the example much clearer. The alternative is to tiresomely spell out every combination.
#define PAIR(a,b) (a|(b<<4))
abstract class DoubleDispatchable
{
abstract Object list_list(List a, List b);
abstract Object list_map(List a, Map b);
abstract Object map_map(Map a, Map b);
abstract Object object_object(Object a, Object b);
const int OBJECT = 0;
const int LIST = 1;
const int MAP = 2;
Object dispatch(Classifiable a, Classifiable b)
{
switch(PAIR(a.classify(), b.classify))
{
case PAIR(LIST, MAP): return list_map(a,b);
case PAIR(MAP, LIST): return list_map(b,a);
case PAIR(LIST, LIST): return list_list(a,b);
case PAIR(MAP, MAP): return map_map(a,b);
default: return object_object(a,b);
}
}
}
class Equivalent extends DoubleDispatchable
{
Object list_list(List a, List b) {...}
Object list_map(List a, Map b) {...}
Object map_map(Map a, Map b) {...}
Object object_object(Object a, Object b) {...}
bool eval(Classifiable a, Classifiable b) { return dispatch(a,b) != false; }
}
What trade-offs, you ask? The Visitor pattern given gets the compiler to guarantee that you write each of the nine cases, whereas hand-written tests and logic simplifies the code.
I specifically chose the Visitor pattern because it seemed more in keeping with the spirit of the Java language and culture, trading verbosity for compiler safety.
I'm extremely comfortable with the other trade-off, emphasizing readability and simplicity. Although, if you go far enough down that road, you might as well look at other languages ;-)
Labels: java, lispy, popular
Ruminations about the performance of anonymous functions in naive Javascript implementations
Block-Structured Javascript (better known as the
Module Idiom) looks like this:
(function () {
var something_or_other;
// code elided
return something_or_other;
})()
This creates a new, anonymous function with its own local scope. Whenever this code is execututed, the interpreter creates a function record in its memory. The exact same thing happens if you create a function and bind it to a variable with
var foo = function (...) { ... };.


The Seasoned Schemer
is devoted to the myriad uses of first class functions. Luckily for us, the ideas in this provocative book map directly to Javascript (see the plug for Lisp in Small Pieces below).
When you close the back cover you will be able to compose programs from functions in powerful new ways, and you can use these new techniques in Scheme, Ruby, and Javascript immediately.
Now let’s consider another common pattern, the
Inner Function: we have a function, and the function needs a helper function. We define the helper function inside our function to make our code more
encapsulated:
var factorial = function (n) {
var factorial_acc = function (acc, m) {
if (0 == m) {
return acc;
} else {
return factorial_acc(m * acc, m - 1);
}
};
return factorial_acc(1, n);
};
What happens when we invoke
factorial it six times?
When the interpreter first encounters the code defining
factorial, it creates a function and assigns it to the variable
factorial. Then each time we invoke the
factorial function, the interpreter creates a new function record for
factorial_acc. So in total, the interpreter creates
seven functions in memory, not two.
hand-rollingIf this code needed hand optimization, you might want to consider ‘lifting’ the definition of
factorial_acc outside of factorial, so it doesn’t get recreated with every invocation:
var factorial_acc = function (acc, m) {
if (0 == m) {
return acc;
} else {
return factorial_acc(m * acc, m - 1);
}
};
var factorial = function (n) {
return factorial_acc(1, n);
};
This produces exactly the same result as our Inner Function version.
factorial_acc doesn’t use any of
factorial’s parameters or variables, so it does not really need to be inside its scope to produce the correct result.
Now you only need two function records, not seven. Two is cheaper than seven. The problem with this approach is that you are proliferating names. If you are binding functions to names in the global environment, it quickly becomes crowded. And you also have a readability issue. Does anything else need to use
factorial_acc? The original code made it very obvious that
factorial_acc is only ever used by
factorial.
A block can help. Yes, the cause of our performance consideration—dynamically creating functions—can actually be part of the solution:
var factorial = (function () {
var factorial_acc = function (acc, m) {
if (0 == m) {
return acc;
} else {
return factorial_acc(m * acc, m - 1);
}
};
return function (n) {
return factorial_acc(1, n);
}
})();
Now what happens? Well, we create an anonymous function for our block. One function record. Within that block’s execution, we create two more functions, one assigned to the variable
factorial_acc, and one returned from the block (and then assigned to the variable
factorial). This code creates three function records, which is still much better than seven.
As a correspondent summarized in email, “we’ve shown how to replace a simple function containing an inner function with a block call that returns a closure referencing the inner function so as to avoid re-defining it on each call. That’s all there is to it.”
(By the way,
Douglas Crockford has done a very good job of explaining this idiom in Javascript, and named it the
Module Pattern. Here’s
a discussion with particular emphasis on OO-style programming. And here’s
a really detailed examination from the YUI team.)
So should you always rewrite inner functions to use a block like this?
I don’t personally fool around with this kind of hand optimization willy-nilly (Of course, you may find the block version more readable than the inner function version. If you do, it’s a win to write it that way). It has a cost: in a more complex function, defining helpers outside of the function may be moving them further away from where they are used, which is a loss for readibility. If you prefer the inner function version, you should be very sure you have a performance problem before you leap to the conclusion that you should rewrite it.
a heuristic for automatic optimization of inner functions and blocksLisp implementations have been optimizing this kind of code, automatically, for decades. That’s because Lisp programmers have been writing programs in this style for decades, either directly or using macros like
let. Here’s the basic heuristic:


Lisp in Small Pieces
is one of the most important books about Javascript ever written. WTF!? may be your first thought. Hold on. Javascript at its heart, a very Lisp-like language with C syntax. So understanding Lisp helps you understand Javascript.
What makes Lisp in Small Pieces
special for Javascript programmers is that it illustrates the principles underlying Lisp (and therefore Javascript) by creating a series of implementations, each of which illustrates the basic mechanisms in the language.
These deep ideas are exactly the things that make Javascript different from other C-syntax languages like Java or Visual Basic. This book, more than any other, will take your understanding from knowing what works on the surface to understanding why and how it works.
- Find every
function invocation that is nested inside another function.
- Analyze all of its variable references lexically. If there are no references to a parameter or local variable of its immediate parent,1 promote the function by defining it in its grand-parent and assigning the definition to a new variable in the grand-parent.
- Replace the original definition in the parent with a reference to the variable you just created in the grand-parent.
- Repeat until all functions are either global or refer to variables in their immediate parent functions.
There’s also a well-know optimization for making blocks themselves free or nearly free:
lambda lifting. So before optimizing things prematurely, test your implementation and see if it is already fast enough for your purposes.
You may discover that you don’t save anything by rewriting things yourself. (You may make your code slower: some optimizations rely on knowing the exact scope of the code being optimized. If you proliferate names by lifting things yourself, the optimizer may not be able to use all of its tricks.)
These techniques have been known for
twenty-five years. If a
Javascript implementation that you are forced to target doesn’t include it, why not demand that the implementers get with the program and, you know, use some of the stuff we’ve known about programming for almost as long as they’ve been alive? Especially if they brag about their prowess at creating programming languages?
conclusion: nice-to-know, but not essentialMy personal conclusion is that the behaviour of a naïve implementation is a “nice-to-know.” I don’t personally worry about optimizing it until I have a known performance issue, at which point it is essential to test to see whether some of the hand-optimizations will actually help.
YMMV.
1. Full closures make things tricky:
Functions that refer to variables in their immediate parent scope are much trickier to optimize away. Sometimes, such a function is supposed to be created anew for each invocation of its parent. For example, if you want to construct a bank balance thingy without using objects, you might write:
function (balance) {
return function (amount) {
balance = balance + amount;
return balance;
};
};
You pass in an initial balance, and it gives you a single function that you can use to deposit (pass a positive number), withdraw (pass a negative number), or check the balance (pass zero). It returns the updated balance in each case.
The inner anonymous function cannot be lifted or optimized away because of its reference to
balance in its parent and because the function can be shown to “escape” its parent.
Whereas in this contrived example:
function (balance, owner) {
return function (amount) {
return {
new_balance: (function () {
if (balance + amount >= 0)
return balance + amount;
else return balance;
})(),
account_owner: owner
}
};
};
Although the inner function still cannot be optimized away, the block within it can be lifted into the inner function and removed, producing:
function (balance, owner) {
return function (amount) {
var __temp;
if (balance + amount >= 0)
__temp = balance + amount;
else __temp = balance;
return {
new_balance: __temp,
account_owner: owner
}
};
};
Labels: lispy, popular
Block-Structured Javascript
Javascript provides
closures with first-class access to variables declared in the enclosing environment. Besides being a handy piece of trivia if you are ever playing Programming Jeopardy, what use is this to the actual working programmer?
There are a lot of ways to take advantage of Javascript’s closures, I am going to describe just one, replicating Algol’s block structure (or Lisp’s
let and
begin macros, if you prefer). When we’re all done you’ll have a handy tool for making your code more readable, separating its concerns, and generally making life easier for programmers who have to read what you’ve written.
From Bricks to BlocksA block is a chunk of code inside a function. Blocks have well-defined entry and exit points. Blocks have their own local variables and functions, and they also have first-class access to variables and functions defined around them. Blocks may nest.
Structuring code into blocks makes large functions more readable and easier to refactor. All of the variables and logic needed for one thing are encapsulated together in the blocks where they are needed, not scattered about in functions everywhere.
when a better design—if unfamiliar—is shown to developers or experienced users, they tend to reject it. Often, it takes careful explanation and having them gain experience with it before the improvement is understood.
—Jef Raskin
While the idiom may be slightly unfamiliar to some, I think you’ll agree that it is highly accessible and not some sort of über-contruct of interest only to the self-professed hacker. And when a programmer encounters it in your code, she will have no trouble figuring out what it does and how it does it.
block structured codeI think everyone can agree that structured programming is a good thing. Functions should be composed of blocks, with the blocks linked together by constructs such as
if statements:
if (some_condition) {
// a block
}
else {
// another block
}
Back before structured programming was introduced, there were
gotos everywhere. Looking at a piece of code with labels, it was hard to know what the flow of control might be at run time. In short, you never had the confidence that everything you needed to know about that code was right there next to that code.
Like almost all modern languages, Javascript’s blocks do structure the flow of control so that you have nice clean entries and exits from each block. And also like most other modern languages, Javascript does nothing to structure access to variables inside blocks. For example:
var j = 0;
if (some_condition) {
// a block with some code elided
}
alert(j);
What will be shown? You can’t guarantee that zero will pop up, because the blocks might modify
j, like this:
var j = 0;
if (true) {
var j = 42;
}
alert(j);
This looks like a programmer error. After all, the
var keyword should be used to declare a new variable. The code between the braces isn’t really a new block of code with its own variables. This is hardly a problem with trivial examples. But if you start building some larger functions, the possibility of accidentally overwriting some code looms larger. Especially once you start refactoring and moving blocks of code around.
What can we do? How about this: the “block” idiom (you can also call it
let or
begin if you want to sound like a Schemer). Here’s the code:
var j = 0;
if (true)
(function () {
var j = 42;
})();
alert(j);
There’s a lot of syntactic noise, what does it mean? In short, we said: “Create a new function. In the body of the new function define a variable called
j and assign 42 to it. Then call that function without any parameters.” Because our new instance of
j is inside a function, it is not the same variable as the
j outside of the function. That can be handy.
Are there any other benefits of this idiom? Yes indeed. Sometimes you have an assignment and you need some logic on the right hand side. How do you write:
var proven = {
var n = Math.round(100*Math.random());
var total = 0;
for (var i = 0; i <= n; ++i) {
total = total + (2*i) + 1
}
return total == ((n + 1) * (n+1));
};
alert(proven);
You can’t, of course. There are two problems with trying to use braces in this case. First, Javascript only allows braces to form code blocks in conjunction with specific keywords like
if and
function. Second, Javascript code blocks are not
expressions—they do not produce values. This is why languages like Javascript need an if statement
and a ternary operator: if blocks produced values, you would only need if expressions.
So in traditional Javascript style, you have to define a function somewhere else and call it… You’ll notice our block idiom includes defining and calling a function. What if our function returns a value? In that case, we can use a block anywhere we want a value, for example:
var proven = (function () {
var n = Math.round(100*Math.random());
var total = 0;
for (var i = 0; i <= n; ++i) {
total = total + (2*i) + 1
}
return total == ((n + 1) * (n+1));
})();
alert(proven);
This new idiom allows us to make first-class blocks anywhere we like. Our blocks are
expressions, and we can use them anywhere we need a value. And as above, Our variables are fully encapsulated, they do not overwrite variables defined elsewhere.
blocks vs. named functionsYou may be wondering, "Why can’t we use a named function?" This is the style in languages like Python, where the Benevolent Dictator does not permit constructions like this. Here is the above code using a named function:
var proven_helper = function () {
var n = Math.round(100*Math.random());
var n_plus_1_squared = function (n) {
return (n + 1) * (n+1);
};
var sum = function (n) {
var total = 0;
for (var i = 0; i <= n; ++i) {
total = total + (2*i) + 1
}
return total;
};
return sum(n) == n_plus_1_squared(n);
};
var proven = proven_helper();
alert(proven);
I find this almost as good as the block. Since you only use it in one place, it is defined where you use it. That is good. And the name might be helpful documentation, just like a one or two-word comment. Balanced against this is the fact that you have added a new function to the outer scope. Reading it later, you might have to scan the rest of the code to see if it is used elsewhere.
There's also a very small advantage of the block over the named function: since you need two statements (one to name a function, another to use it), you can only use a named function in normal code blocks. You cannot use a named function when you need an expression, unless you resign yourself to naming the function in one place and using it somewhere else.
For example, when constructing array or hash literals, you can use expressions. A block is an expression, while two statements (one to create a named function and one to call it) are not an expression. So a named function would need to be defined outside of an array or hash literal, while a block can be used inside it, placing the code closer to where it is used.
block structure and cleaner code


JavaScript: The Definitive Guide
takes the time to actually discuss the language, to explain what Javascript can do and how to do it. And of course, the book also provides an in-depth reference of every function and object you are likely to encounter in most implementations. Recommended.
Structuring code into blocks makes large functions more readable and easier to refactor. All of the variables and logic needed for one thing are encapsulated together in the blocks where they are needed, not scattered about in functions everywhere. If you see a variable declared inside a block, you know it is only used inside the block. If you see a variable with the same name outside the block—a regrettable occurrence—you know that moving or changing the block will not affect the code working with variables outside of the block.
You probably know that you can put a function inside of a function in Javascript:
var factorial = function (n) {
var factorial_acc = function (acc, n) {
if (0 == n) {
return acc;
} else {
return factorial_acc(n * acc, n - 1);
}
};
return factorial_acc(1, n);
}
alert(factorial(6));
And this is a good thing, it keeps the function
factorial_acc inside of
factorial. Since that’s the only place you need it, why declare it anywhere else? The fact that you can put a function inside of a function implies that you can put a function inside of a block as well:
var proven = (function () {
var n = Math.round(100*Math.random());
var n_plus_1_squared = function (n) {
return (n + 1) * (n+1);
};
var sum = function (n) {
var total = 0;
for (var i = 0; i <= n; ++i) {
total = total + (2*i) + 1
}
return total;
};
return sum(n) == n_plus_1_squared(n);
})();
alert(proven);
If you only need the functions
n_plus_1_squared and
sum to do this one job, in this one place, why should they be defined at top level cluttering up your code? Why force other programmers to search through your code figuring out where they are used before making changes?
Block structure may seem
unfamilar at first, but give blocks a try and see whether you start finding the code even easier to read and refactor with blocks. Like me, you will find that structuring your code with blocks puts the things you use right where you use them.
update:
Ruminations about the performance of anonymous functions in naive Javascript implementations.
Labels: lispy, popular
How to Run Javascript on the JVM in Just Fifteen Minutes
what and whyThis post is about executing Javascript inside the JVM without using a browser. Besides the fact that people are talking about
running Javascript on the server (
again, and
again), here’s why my colleagues and I used it on a recent project:
We have some logic that needs to run on the server and on the client, depending on when the application applies it. There is like an incredibly complex form validation involed. Think of a loan application, for example. Zillions of rules like “at least five years at current location
or at most three locations in ten years
or owns current location for at least one year.” The whole thing forms a big logical expression that needs to be evaluated in such a way that we can report which pieces are missing or do not meet requirements (
Declined because income is insufficient and does not state purpose of loan).
There are a couple of ways to handle this. One is to submit the form back to the server for validation. Another is to write everything in Java, but use
a sophisticated tool to render the Java into Javascript. Naturally, our team chose a third option, The Rails Way (available for
pre-order
).
We have a Domain-Specific Language for describing the rules. Business users use the DSL, and another tool writes code from that. We could, in theory, write Java methods for the server
and write Javascript for the client. We chose to start with Javascript, and we’ll write Java for the server if running Javascript on the server turns out to be
unperformant slow.
In the mean time, we decided that having some Java make a simple function call to a Javascript function and process a simple result was a reasonable first step. As a side benefit, we run all our server-side Javascript unit tests in Java test suites alongside our Java unit tests.
And after some fiddling around, we got Javascript working on the JVM. My bet is that you can get it working too, and it won’t take more than fifteen minutes.
Care to try it?
step zero: the Java Virtual Machine (JVM)You’ll need JDK 1.5 or 1.6 from Sun. If you already have this, move on to step one. Still reading? You’ll need to do a big install before we go further.
Go to the
downloads page and download the latest thing they have on offer with the words “JDK” in it. You won’t need JEE (the framework formerly known as J2EE) for this exercise, but if you know what it is you know enough to decide whether to download it.
Right now, you want
JDK 6u2. Go get it and suffer through the installation process.
Step one: Bean Scripting FrameworkJava6 has a new framework for running “scripting” languages, and it’s built into Java6. We’re not going to use it today, just because some of you may still need to make stuff work with JDK 1.5 in production. Instead, we’re going to go get the Jakarta Bean Scripting Framework (BSF).
You can download it here. We’ll need
bsf.jar.
step two: fix gotchasYMMV, but I found that I couldn’t get BSF working without including the
Jakarta Commons-Logging jar. So if you don’t have this floating around, go
here and download it. I experimented, and I could ignore everything except
commons-logging-1.1.jar. If that was missing, BSF kakked.
step three: RhinoSince we’re going to run Javascript, we need an interpreter.
Rhino to the rescue.
Download it. We’ll need
js.jar.
step four: keeping things organizedReady to code? Let’s start with a directory for all of our stuff. Call it
hello_javascript. For the sake of keeping thing simple, set up the sub-structure as follows:
hello_javascripthello_javascript\libYou may be using a fancy IDE, you may be using a text editor and have to graft your classpaths together with chicken wire. The important thing is that your classpath, besides including all of Java’s required stuff, and your own Java classes, also includes
bsf.jar,
commons-logging-1.1.jar and
js.jar.
We’ll put all three in the
lib subdirectory:
hello_javascript\lib\bsf.jarhello_javascript\lib\commons-logging-1.1.jarhello_javascript\lib\js.jarstep five: “Hello, Javascript”Let’s write some Java: create the following subdirectories and put a file called
HelloJavascript.java in it:
hello_javascript\com\raganwald\public\HelloJavascript.javaLet’s give it some code:
package com.raganwald.public;
import org.apache.bsf.BSFManager;
public class HelloJavascript {
public static void main (final String[] argv) {
final BSFManager manager = new BSFManager();
final Object jso = manager.eval("javascript", "(java)", 1, 1, "'hello, Javascript'");
System.out.println(jso.toString());
}
}
Run your new Java application. Did you see that? It interpreted some Javascript
in the JVM without a browser. Check your watch. Did you need more than a quarter of an hour? I didn’t think so.
You can try more ambitious code:
manager.eval(
"javascript", "(java)", 1, 1,
"var f = function (what) { return 'hello, ' + what; }; f('Javascript);");
including other files is an exercise left for the readerI didn’t find an easy way to get Javascript files to include other Javascript files. This isn’t the worst thing in the world, but you certainly don’t want to write anything substantial inside of Java strings. So try experimenting with reading javascript files right off the classpath.
I created a subdirectory called
javascript:
hello_javascript\javascript And you can read Javascript into your strings or Stringbuffers with some fairly simple code, thanks to a utility built into BSF:
import org.apache.bsf.util.IOUtils;
// ...
static String readScript(final String fileName) throws Exception {
final FileReader in = new FileReader(fileName);
return IOUtils.getStringFromReader(in);
}
That reads some script into a string. You can then prepend it to whatever you want to evaluate. Note that if you want to set up some sort of simple checking to make sure that you don’t “include” the same file twice, you will need to write yourself a little framework for that, perhaps using a
Set to keep track of what you’ve already loaded.
garbage in, garbage out


Prototype and Scriptaculous are the Javascript libraries that make slick transitions and UI effects easy one-liners. Prototype does more than just make an application look good: it adds Ruby and Smalltalk-like methods for handling Hashes, Arrays, and the DOM.
This book is one of the fastest ways to get up to speed on taking Javascript to the next level.
This is nice, and with a little work you could make a program that reads paths to Javascript files off the commend line and executes them. But to make things really interesting, you want to find a way to get Java data into your JavaScript and do something useful with the results, not just print it as a String.
BSF provides a way to inject objects into the scripting language’s environment, so you could use that facility. When writing automated unit tests for that particular project, I chose a simpler route: I serialized the data into JSON and used that to call a Javascript function directly via BSF:
manager.eval("javascript", "(java)", 1, 1,
"myJavascriptFunction(" + myJSONString + ");");
This is a
really bad idea if your JSON is handed you from an insecure source, such as a public web page calling you back via
XMLHttpRequest, but if you trust your source, this works wonderfully.
Now what do you do with the result? If you are generating something esoteric like a Javascript function, I have no idea. In my own case, I return all values as simple trees of Hashes (Javascript objects without any special methods) and Arrays. I convert those into Java trees of Maps and Arrays:
import org.mozilla.javascript.NativeArray;
import org.mozilla.javascript.ScriptableObject;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
// ...
static List unwrapNativeArray (final NativeArray na) {
return new ArrayList<Object> () {{
for (int i = 0; i < na.getLength(); ++i) {
add(unwrapNative(na.get(i, null)));
}
}};
}
static List unwrapPrototypeArray (final ScriptableObject sObj) {
return new ArrayList<Object> () {{
final List<Object> sObjIds = Arrays.asList(sObj.getAllIds());
for (int i = 0; sObjIds.contains(i); ++i) {
add(unwrapNative(sObj.get(i, null)));
}
}};
}
static Map unwrapObject (final ScriptableObject sObj) {
return new HashMap<String, Object> () {{
for (Object id: sObj.getAllIds()) {
put(id.toString(), unwrapNative(sObj.get(id.toString(), null)));
}
}};
}
static Object unwrapNative (final Object obj) {
if (obj instanceof NativeArray) {
return unwrapNativeArray((NativeArray) obj);
}
else if (obj instanceof ScriptableObject) {
final ScriptableObject sObj = (ScriptableObject) obj;
final List<Object> sObjIds = Arrays.asList(sObj.getAllIds());
if (sObjIds.contains("keys")) { // a prototype enumerable/hash
return unwrapObject(sObj);
}
else if (sObjIds.contains("flatten")) { // a prototype enumerable/array
return unwrapPrototypeArray(sObj);
}
else return unwrapObject(sObj);
}
else return obj;
}
Check your watch. Are you still under fifteen minutes? Great!
Labels: lispy, popular
Style Matters
Art Matters
—button on a fellow coffee lover’s lapel
Giles Bowkett has written an
entertaining opinion piece, wherein he argues that the gulf between sophisticated programmer and fundamentalist programmer is far, far wider than the hairline between fundamentalist programmers that use different programming languages.
You know, blog posts with sufficient depth (anything more than “How to call a COM+ object on Windows 2000 Server using CORBA over SOAP from IE 7”) are Rorschach Blots. If you’re predisposed to affinity, you read your own thoughts in them. And indeed, I immediately liked the article, because I thought he was saying that fundamentalist programmers use
Blub, and that when he talks about the limitations of languages he was saying that
people matter more than tools.

Essentials of Programming Languages
is an in-depth exploration of the deep ideas behind programming languages. This is a hands-on book: you’ll build a series of interpreters that actually implement each language feature, building a full understanding of the way programming languages actually work.
This is critical for learning how to use these features in languages that have them, but better than that it also teaches you how to greenspun them into languages that don’t!
Of course, he’s actually saying a little of that but also a little of something else, and what’s really important
to me are the ways in which what he says are different than the things I was already thinking. A mirror is a dangerous tool for experiencing the universe!
And then I read
Andrew Norris’ reply, and what struck me was that Andrew may have had the opposite reaction to Giles’ opinions: he seemed to fixate on the differences between what Giles was saying and what Andrew was thinking. But closely reading Andrew’s impassioned response, I recognize so many similarities to the two points of view that I’m shocked they aren’t both the same person.
As a colleague used to say, these two men are in
violent agreement. Giles is talking about the importance of the programmer’s deep understanding, and Andrew is talking about the importance of the programmer’s deep understanding as well!
They both seem to agree without reservation on the utter dead-endedness of the fundamentalist, “all innovation stopped with the invention of language X” thinking, for all values of X (including Lisp!). You can sort out the differences for yourself, they pale in comparison.
StylingWhat I find most valuable (in the sense of expanding my own thinking, not just confirming what I already thought I knew) about Giles’ post is his assertion on the value of a personal style, of having a voice, on finding something unique to contribute, something that shines through the specific language and platform you use.
Isn’t that related to the entire
meta-programming/DSL thingie? This idea that the language is a platform for a program so personal it deserves its own programming language?
Isn’t that
exactly what both Giles and Andrew are promoting? Giles says “develop your personal style that is independent of the programming language.” Andrew says “use the tool that fits the job.” Well, what could be more personal than bending a tool to become a new tool customized for the problem? What could be more “polyphonic?”
And Giles’ example of Avi Bryant brings up an important component of developing a personal style. Giles pointed out the common thread in Avi’s work. he suggests it is Avi’s style, and it is. But Avi can’t use that style on any arbitrary problem. To a certain extent, Avi selects problems that fit his style.
To develop a style, you must be ready to apply it to problems, and you must also be prepared to walk away from problems where it really doesn’t apply. Otherwise, you will become the jack of all trades but the master of none. of course a great programmer can use shell scripts, C, and Python. But the programmer who has equal facility with all three problem domains is inferior to the programmer who “knows enough to be dangerous” in most domains but has obtained mastery in a very few of them.
So, Develop your style. Choose problems wisely. And learn your tools deeply, so that you may choose the right tools to express your style.
Labels: lispy
Steve's road into the godawful swamp
You’ve all read about the Road to Lisp. I was on it for a little over a year. It’s a great road, very enlightening, blah blah blah, but what they fail to mention is that Lisp isn’t the at the end of it.
Lisp is just the last semi-civilized outpost you hit before it turns into a dirt road, one that leads into the godawful swamp most of us spend our programming careers slugging around in.
I guarantee you there isn’t one single Lisp programmer out there who uses exclusively Lisp. Instead we spend our time hacking around its inadequacies, often in other languages.
Labels: lispy
From Abstraction to Zipf
Alpha…Damien Katz raised an interesting question the other day: he wondered whether elements of computer programs followed a
Zipf distribution. In other words, he wondered whether the most frequently used item occurred a lot more than the next most frequent, which occurs a lot more than the third most frequent, and so forth.
That’s an interesting question. Is it true? And if it is, what does that tell us about composing programs?
Looking at computer programs, some things seem to follow Zipf’s law and others don’t. In C++ programs, loop indices called
i probably are an order of magnitude more common than those called
j. Unless you use
Apps Hungarian, of course. But are
for loops that much more common than
if statements? That might vary from place to place within a single program, but overall they might be roughly equal in frequency.
If we step up a level and look at idioms, my gut tells me that there is a strong Zipf distribution. Although each program will differ, some programs might make heavy use of lists, others of maps. Some might do a lot with closures, others with objects. These are fairly generic, of course.
If we have a look at the more domain-specific items in a program, we really see a Zipf distribution. Some things pop up again and again and again in a program, others are rare.
Why Abstract?1The most important tool we have for composing programs is
abstraction. When we create an abstraction for something, we get one and a half wins (a
sesqui-win).
The full win is that we get to take a common element and place its responsibility in once place. For example, if your language gives you a
map function or method, there is one place for the code that applies a function to each element of a collection and produces a collection of the results. Unlike a “pattern,” you don’t have to re-implement map every time you use it, you just use it. The centralization of responsibility in a single place is very powerful way to
separate concerns.
The half win is that we don’t need to understand its implementation, we only need to understand its interface. I honestly don’t know whether Ruby’s built-in
map method applies itself to a collection one at a time from the start to the end or in some other order. I suspect it’s from start to end for ordered collections, but I don’t know and I don’t have to know.
This is only half a win, because for anything too complicated to explain on a PowerPoint slide where you are promising the newest silver bullet, abstractions have a nasty habit of leaking.
Leaky abstractions force you to understand the implementation to use them.
There are some drawbacks to abstractions. As noted, sometimes the abstraction leaks. In that case, you often have to look up what an abstraction does. If there was no abstraction and the implementation was right there in the middle of your code, you wouldn’t have to look it up. So when you see:
class BlogPost << ActiveRecord::Base
has_many :comments
# ...
end
You obviously to need to know that
has_many creates methods like
comments,
comment_ids= and
comments<< automagically for you. In fact, that is kind of the point of
has_many. But if you are doing anything non-trivial, you also need to know whether
has_many actually
defined comments,
comment_ids= and
comments<< or whether it has merely modified
BlogPost to
handle those messages. In that sense, the abstraction leaks.
You get much the same thing if you build a Java class hierarchy that is umpteen levels deep festooned with abstract base classes, factories, and dependency injection. In one sense you get a really tight abstraction. In another sense, you have a big leak: you need to know its internals to get any real work done.
Another drawback to abstractions fall out of their strength. If you haven’t seen a particular abstraction before, you don’t know what it does and how it works. Contrary to popular hype, self-documenting names are not enough. If you are a Rails user, raise your hand if you can tell me what all of the various options for
has_many are and what they do. You have to learn each abstraction’s interface, just as you had to learn what a
for loop does for you.
The win (not having to know how it works) is also a loss (not knowing what it does).
An abstraction is an abstraction is an abstractionI think most people agree that named subroutines (or functions, or procedures) are an excellent abstraction. First-class functions and anonymous functions seem to be an acquired taste (just one taste, and your programs will quickly acquire them). Objects—in the strong sense of polymorphic programs—are well-regarded abstractions.
Macros, DSLs, and other means of writing your own mini-language provoke the wildest claims of productivity and equally baseless fears of unreadable, fragile, “magic” code built out of gossamer dreams.
What these all have in common is that these functional abstractions live happily within the syntax provided by the host programming language. They abstract function without abstracting syntax. For the most part, non-syntactic abstraction is uncontroversial.
There seems to be debate over syntactic abstraction. Macros, DSLs, and other means of writing your own mini-language provoke the wildest claims of productivity and equally baseless fears of unreadable, fragile, “magic” code built out of gossamer dreams. The basic “problem” with syntactic abstractions like
has_many in Rails or
let in Scheme is that syntactic abstractions force you to learn their interface.
They are right in your face, especially if they are very brief. Consider:
BlogPost.find_or_create_by_name('From Abstraction to Zipf').comments << Comment.new('bleh!', 'anonymous')
Don’t bother searching for the
find_or_create_by_name method or the
comments<< method in the code base.
All the same, syntactic abstractions are
just like functional abstractions. You get some wins, and you pay some costs in understanding and potential “leakiness.”
Zipf to the rescueSo should we zealously abstract everything we can? Or should we conservatively avoid “magic” like syntactic abstractions?
Every abstraction has a fixed learning cost. That cost is amortized over all of the instances we’ll encounter. So you need to learn how
has_many works just once, and then you benefit every time you read it or write it. This is why abstractions built into a programming language are big wins: you amortize their cost over every program written in the language. Isn’t that why Java and C# look so much like C++ which looks so much like C? They were offering abstractions that have been paid up for years.
Zipf’s law tells us that each application only needs a very few new kinds of abstractions to get most of the benefits. There are only a few low-hanging fruits, but each is plump and juicy.
An abstraction built into a framework like
Rinda or
MapReduce is amortized over fewer programs. A domain-specific abstraction for a single organization is amortized over a very few programs, and an abstraction built into one program is only amortized over instances in the one code base. Unsurprisingly, opportunities for abstraction follow the Zipf distribution.
Of course, that distribution is self-similar: the abstractions
within a program follow the same distribution as the abstractions
within an organization. So even if you aren’t inventing a new programming language, Zipf can help.
Quite simply, it pays to aggressively abstract the few things that are encountered the most frequently. In a CRUD web application, schema relationships like
belongs_to are incredibly frequent. There’s a big win from creating a syntactic abstraction, even if the learning curve looks daunting to newcomers. Creating a domain-specific language for database schema changes is also a big win.
Note that we aren’t saying that
has_one,
belongs_to, and
has_many appear more than 50% of the time. They may be quite infrequent. The point is that they are far more frequent than something else you could abstract, and the other thing will take just as much time to learn to read but will pay off far less often.
But should you create a DSL for
list comprehensions in your CRUD application or just use the language’s built-in collections? I would say, you would need an application that uses an awful lot of lists before a syntactic abstraction for lists is a win.
It might be, but the nice thing is, it probably won’t be close: Zipf’s law tells us that each application only needs a very few new kinds of abstractions to get most of the benefits. There are only a few low-hanging fruits, but each is plump and juicy.
…OmegaI think it is a win to use aggressive abstraction techniques like metaprogramming, but only if you restrain yourself to abstracting the very few things that appear most frequently in your code base. The things that are used less frequently should be abstracted using less aggressive techniques.
The resulting program may not be magically 50% shorter than an unabstracted program, because the “long tail” of idioms and patterns are simply not worth abstracting away even though we have the tools for doing so.
You can probably tell whether a program has the proper level of abstraction just by checking its distribution.
If you find a big abstraction framework for one thing while frequently used idioms are expressed using patterns instead of abstractions, you are probably looking at an improperly abstracted program. You have to wade through verbiage to read everything, and the one time you don't need to abstract the code away, you have to chase definitions through indirection.
And in a well-written program, I would expect that the things that occur most frequently are expressed at the highest and most powerful levels of abstraction, while things that happen infrequently are expressed in a more direct and explicit way. Such a program would be easy to read throughout.
- Why Abstract?
is a delightful and wonderfully brief book that explains the appeal of abstract art in a simple and direct way. It’s one of the treasures on my bookshelf and I urge you to find a copy through a used book seller.
- Suprised? There is no footnote two. But if there was, it would be to mention that Paul Graham’s incredible book On LISP: Advanced Techniques for Common LISP
is freely available online. This is a must read for anyone interested in composing syntactic abstractions in any language.
Labels: lispy, ruby
Don't have a COW, man? What Haskell teaches us about writing Enterprise-scale software
Berlin Brown asked:
I read programming.reddit religiously but I rarely see something that could be used in a non-startup environment. Am I wrong, or are you considering deploying a haskell enterprise web application? And if the stuff discussed isn’t relevant for the next 5 years (i.e. an erlang based webapp) will it ever be relevant?
Yes, I use what I read on
programming.reddit.com in my day job. That’s one of the reasons I
have this day job: it’s part of what I do to sift through all of the cool stuff and find the things that are practical today.
Since you mentioned
Haskell:
Consider a multi-threaded application with shared memory, like a really big web server that has some big shared collection of things in memory. From time to time you add things to the big collection, from time to time you remove them.
Pessimism and coarse-grained lockingOne way to arbitrate multiple threads is to have one copy of the collection with strict locking protocols that apply to its coarse-grained operations like
add,
remove, and
fetch.
In a language like Java, you can
synchonize those methods and you’re done. You have just implemented
mutex locking: only one thread can use the collection at a time. If one thread is fetching something from the collection, all other threads must wait, even if all they want to do is fetch things as well.
That sucks
tbng qvpx, especially if you do lots of reading: why should thread
546 have to wait to fetch something just because thread
532 is currently fetching something?
1Read and write locksThe next improvement is to have two kinds of locks:
read locks and
write locks. Two or more threads can lock the collection for reading, but if any thread locks the collection for writing, all of the other threads have to wait until it is done.
This works for about 17 clock ticks, and then you fix the bug by adding a new rule: if a thread wants a write lock but one or more threads have read locks, it has to wait, but any other threads that want read locks can’t have them. Even though the only threads with locks have read locks, they still have to wait.
The thread waiting to write gets a kind of pending write lock that blocks all other threads from taking out new locks. And then you fix the next bug by saying that all threads waiting wait in a priority queue so that the read threads aren’t starved by write threads and the write threads aren’t starved by read threads.

Purely Functional Data Structures
takes you step by step through the design and implementation of copy on write collections. These collections
can be used in purely functional languages, but they are just as useful in multi-paradigm languages like Java, Ruby, or Python handling multiple threads and performance optimization. The author’s
thesis is available on line for free.
You now have a system that is pretty fast a long as you don’t write things very often. For example, you could build a fairly nice cache using read-write locking provided it is tuned so that you get lots of hits and only rarely have to drop things from the cache or add things to the cache. But if you’re doing something like maintaining a big index in memory where you have to make lots of updates, the writes will slow everything down.
These kinds of locking protocols are called
pessimistic protocols: you assume bad things will happen and prevent them from happening up front by blocking threads from executing until it is safe to proceed.
Multi-version concurrency controlAnother way to arbitrate multiple threads is to make copies of the collection whenever you perform an update.
2 You maintain multiple
versions of the collection. When a thread needs the collection, it grabs the latest copy. When it wants to
remove or
add elements, it writes a new copy without disturbing an existing copy.
This works really well in that threads that only want to read are never blocked. They always run at full speed, even if another thread is in the middle of an update. Hand-waving over how you figure out the whole “latest copy” thing, this scheme doesn’t work so well for threads that write.
The problem is one of
serialization: this word means, if some set of operations takes place on the collection, the result must be the same as if the operations were conducted one at a time on the collection. There is no guarantee of the
order of the operations, just that the result is the same as if they had been carried out in some order.
Let’s use an example. Say our collection is a Map. It starts empty:
{ }
Operation
A adds an element:
{...a: "A"...}
As does operation
B:
{...b: "B"...}
And operation
C:
{...c: "C"...}
If we start with an empty hash and perform all three operations, the result should be
{ a: "A", b: "B", c: "C" }, exactly the same result as if each operation were performed serially, one after the other. But what happens if each operation is performed by a thread that grabs the initial copy,
{} and writes its result back to the collection? Something called a
race condition: the result will be
{ a: "A" },
{ b: "B" }, or
{ c: "C" }, with the “winner” being the last one to write its result.
We can fix this problem in a couple of ways. One way is to keep the versions so that reading threads work at full speed, but use mutexes for write locks so that only one thread can write at a time. That’s simple, and if you can figure out a cheap way to make copies, works pretty well.
That’s your pessimistic protocol again (threads that write have to wait on other threads that write), but it’s a huge win for threads that read.
Making this work is tricky. Copying the entire thing is expensive, so you need to do clever tricks where you only copy the things that change and share the things that don’t change. And you can get a big, big win if you can avoid write conflicts by arbitrating conflict at a finer grain. For example, a
HashMap uses a set of linked lists. If two different threads write to different “buckets,” you can merge their results rather than rolling one back and starting again.
One of the best books ever written on the subject of implementing fault tolerant concurrency (either on a single system or a distributed network) is
Concurrency Control and Recovery in Database Systems
.
Don’t be fooled by the word “database”—the techniques described are just as useful for implementing distributed algorithms like MapReduce, concurrent data structures like high-performance collections, or for building multi-threaded communication systems based on queues.
Like all classics, it’s also available
online for free.
There is a lot of extra overhead for this if a thread wants to write while another thread is reading a version, so it is only a big win if writes are fairly rare. Remember, one of the big wins is that reads
never wait on writes because they work with immutable versions of the collection.
Depending on how many threads you have, what kinds of operations are most likely, and other factors, this kind of system can be orders of magnitude faster than coarse-grained pessimistic locking.
Sometimes you want a slightly different protocol. The aforementioned is often called
single write, many reads. It requires threads that plan on writing to know in advance they need to write. But for something like a cache, you don’t know you need to write until you miss the cache. And then it’s too late to get a write lock.
Optimism and many writes, many readsThe easiest way to avoid having to pre-declare whether a thread is a reader or a writer is by letting all of the threads do as they please. They all get the latest version and chug happily along.
When they are finished, if they never executed an
add or
remove we let go of the copy of the collection and we’re done. If a thread wants to write, it makes a copy as above and writes to its copy. But it doesn’t have to grab a lock while it is writing, so writes don’t wait on other writes.
Now, if a thread
has executed a write (an
add or
remove), when it is done we check the result to see if it violates serializability.
For example, we can number our versions. Let’s say that
{} is version
0. The first thread to finish, let’s say it’s the thread performing operation
B, creates its result:
{ b: "B" }. Now it checks the collection to see if anyone has updated it since
B read the collection. The collection is still at version
0, so
B can write its result.
B writes
{ b: "B" } to the collection and calls it version
1.
Next
A finishes and notices that the collection is at version
1. This is a problem, because
A is working with an updated version
0, so it has to throw out its work, fetch version
1, and try again. This is exactly the same thing as using a source code control system like
Subversion to resolve conflicts.
This many reads, many writes strategy is called an
optimistic protocol because you do work in the hope that nothing will cause you to throw it out and try again. It’s a big win if writes do happen at the same time, but they rarely conflict.
For example, if you have a good strategy for
merging writes, this is huge.
So what?Well, it would be great if you didn’t have to reinvent the wheel and have to work out all of the implications when you want to make a really fast shared collection in a multi-threaded environment. What you want is someone to share a wealth of experience about how to make really fast copies of things by only changing the little bits that you change instead of the whole thing, and so forth.
I like languages which say, “No, you don't want to write it the way you’re thinking. There’s a vastly better way to solve this whole class of problems.” Me: brain explodes
Where do you go for that kind of information? How about to people who spend all day thinking about collections that cannot change because their programming languages are purely functional?
Yes, what I’ve just described is
exactly how languages like Haskell implement mutable collections like dictionaries and lists. In purely functional languages, collections never change. Adding something to a collection is really creating a new collection with an extra element. This is the exact same implementation that we need for building optimistically locked collections in a multi-threaded environment!
Haskell teaches us extremely useful techniques for writing Enterprise-scale software.
And more techniques: Hard-core concurrency considerations
- Now, you might be saying, “what a waste, this is like locking a table in a database when we should be locking rows.” But if you look at the database closely, it does lock the table when you perform certain actions like deleting a row. Or it does something more complicated, and now that you’ve read the entire post, you know what it really does.
- Making Copies on Writes is called copy on write semantics, or COW for short. Chew on that for a while.
Labels: java, lispy, popular
An Approach to Composing Domain-Specific Languages in Ruby
Whoa! This looks like a long post with a lot of code snippets. Am I going to have to do a lot of hard thinking, or can I just relax and enjoy a good rambling essay?This is a bit long, probably (like all my posts) 200% longer than necessary. If you just want to see a neat DSL that implements Haskell and Python’s List Comprehensions written in Ruby, just
scroll to the bottom.
If I do bother to read it all, will I learn some neat hacks?Yes, but you could learn them just as well by reading the
source code directly.
So the benefit of reading the whole thing is...?The List Comprehensions DSL is the
what. The source code is the
how. But the essay is the
why.
Reading the whole thing will take you through some of the pitfalls of writing DSLs and explain why I chose my particular workarounds.
Furthermore,
there are a lot of corners in Ruby where you can easily assume that things work one way, but really they don’t. If you actually try the snippets on your computer, you’ll have a much better chance of remembering where the pitfalls are. That’s why I tried to give a working example for every point, rather than just explaining things in words.
Of course, if you have no interest in writing your own Domain Specific Languages in Ruby just yet... this isn’t meant as a
popular essay, rather it’s meant as an experience report for fellow
practitioners. And honestly, there’s a world market for maybe five tools for writing DSLs in Ruby.
But since you’re here, the essay starts below!
An Approach to Composing Domain-Specific Languages in RubyRuby is often touted as a good language for writing Domain-Specific Languages (“DSLs”). There are a few arguments in favour of writing a DSL as part of an application.
The first argument that comes to mind is that if the application’s domain experts have a specific natural language or jargon of their own, writing a DSL makes it easy for programmers and domain experts to collaborate. While it is rare to find substantial applications entirely written by non-programmers at this time in
any language, it is quite feasible for non-programmers to write or validate portions of an application representing its “business rules” or domain logic, while programmers maintain its infrastructure.
include StarbucksDSL
order = latte venti, half_caf, non_fat, no_foam, no_whip
print order.prepare
—Building Domain Specific Languages in Ruby
Another argument in favour of a DSL is that even when non-programmers are not involved directly in coding an application, the programmers themselves often have a jargon of their own to describe entities, algorithms and data structures in the application. Having portions of the application written in a language closely resembling the programmer’s own jargon makes it easy for them to read each other’s work and understand its intent.
Successful examples of DSLs embedded within existing languages and frameworks include
Ruby on Rails’ ActiveRecord, where statements such as:
has_and_belongs_to_many :Bar
validates_presence_of :blitz
some_bars = Bar.find_by_tavern_license(license_number)
Are self-documenting to anyone familiar with relational models.
The final argument I’ll repeat here is that a DSL is a very effective way to separate the
what from the
how of an algorithm.
Separation of concerns is a desirable property of good programs, and DSLs provide this separation very clearly. In the ActiveRecord examples above, the exact mechanisms of relating tables, validating records, and performing searches is “abstracted away” from the code where the programmer declares how she would like the results used.
Freedom is SlaveryDSLs can be hacked together quickly in Ruby (whether they can be made sufficiently robust for your production needs may require considerably more care). Hacking a DSL together with little effort is a benefit, especially when prototyping: sometimes the best way to design a DSL is to try to use it, so you can discover what you need to express.
Some developers have raised the concern that extensive use of “magic” features leads to code that cannot be understood or maintained.
1 My own feeling is that DSLs lead to code that is easier to understand, not more difficult to understand. This leaves an argument about maintenance. Some techniques for meta-programming, such as extending core classes like Array, have what you might call “non-local effects.”
For example, two different pieces of code might try to extend the same core class, interfering with each other. Each works in isolation and passes all of its unit tests. But when plugged into a larger application that uses them together, they break.
Lispers are among the best grads of the Sweep-It-Under-Someone-Else’s-Carpet School of Simulated Simplicity.
—Larry Wall
Another problem occurs with extending the
Kernel class or creating “top level” methods to be used as verbs in a DSL. You end up with name space crowding: you must be very careful that you do not redefine en existing method.
To fix this problem, the code that implements the DSL needs to be contained so that it does not interfere with other code. We can still implement verbs as methods, but we must implement those methods in separate objects, classes, or modules.
Zen in the Art of Program MaintenanceAn established technique for implementing methods in objects is to define the methods and then execute a block of code using
instance_eval so that it has access to the object’s methods.
I’m trying to get the Zen of building DSLs using Ruby. After reading a dozen or so pieces referenced by my favourite search engine, I have a feeling I’m still not quite getting it.
—Don Box
You know, code expresses an idea better than words express an idea… when the idea is about coding. Please try this example in irb. Don’t just skim the text and nod: there’s a powerful learning mechanism at work when you physically do things as you’re learning, even if it’s just copying, pasting, comparing the result in one window to the text in another, and so on:
def bjarne
'Barney'
end
dsl = Object.new
def dsl.phred
'Fred'
end
plus = ' plus '
print dsl.instance_eval {
phred + plus + bjarne
}
##### "Fred plus Barney"
What does this show? Well, we have created a way to use a method defined in our
dsl object, a local variable
plus, and a top-level method
bjarne. We can imagine scaling this up to defining a rich DSL in our DSL object and being able to mix verbs from the DSL with instance variables and other methods as we please.
Touching back on the subject of containment, we have defined
bjarne in
Kernel. Now
bjarne is essentially global. If we already defined
bjarne somewhere else, we just clobbered it. And if we later run a piece of code that defines
bjarne, we’ll clobber our own version.
phred is different. It’s defined inside of an object, and it doesn’t conflict with any other
phred we define elsewhere.
Great! So… Can we cite a few examples of this technique in action (such as
Jamis’ post where he calls
phred and
bjarne examples of
Sandboxing and
Top-level methods) and end the post here?
No. The code above looks fine. But there is a hidden problem with this sandboxing technique:
MyDsl = Object.new
def MyDsl.phred
'Fred'
end
class ClientCode
def bjarne
'Barney'
end
def friends
plus = ' plus '
MyDsl.instance_eval { phred + plus + bjarne }
end
end
ClientCode.new.friends
##### -:15:in `friends': undefined local variable or method `bjarne' for # (NameError) from -:15:in `friends' from -:20
WTF?! This looks just like our top-level example, but we’ve placed our code inside of a
ClientCode method. And
bjarne is a method in ClientCode: this way we can continue to separate concerns, keeping
phred inside our DSL and
bjarne inside of the class where we are using the DSL. But it doesn’t work.
Why instance_eval breaks (in tedious detail)As you know, everything in Ruby is either a variable or a method (how it figures out the difference is a major irritation). When you invoke a method, you are actually sending a
message to a
receiver.
2 Sometimes you name the receiver (
some_object.a_method), and there is no ambiguity.
But when you just name the method (like
bjarne), Ruby tries to find the method for itself. It does so by looking to see whether it is an instance method, in which case it behaves like
self.bjarne. If not, it looks to see whether
bjarne is top-level, in which case it calls that method in the
Kernel. See for yourself:
def foo
'top level foo'
end
def bar
'top level bar'
end
class Test
def bar
'instance method bar'
end
def test
p foo
p bar
end
end
Test.new.test
##### "top level foo" "instance method bar"
See? It looks for instance methods and then for top-level methods if it can’t find anything. (Again, we are hand-waving over the pesky problem with instance variables in the case where we don’t use
()). What’s the problem? Well, I actually mis-described what happens. Here it is again, with more precision:
It looks for methods defined in the object
self, and then for top-level methods if it can’t find anything. Of course,
self is the current object. Unless it isn’t: That’s what
instance_eval does: it evaluates a block but it changes
self to point to its receiver instead of the object where the code is executing. Everything else stays the same. One more example to show the mechanism:
def foo
'top level foo'
end
def bar
'top level bar'
end
class Test
def bar
'instance method bar'
end
def blitz
'current object blitz'
end
def test
p foo
p bar
o = Object.new
def o.blitz
'redefined self blitz'
end
p o.instance_eval { blitz }
p o.instance_eval { 'bar within o gives: ' + bar }
end
end
Test.new.test
##### "top level foo" "instance method bar" "redefined self blitz" "bar within o gives: top level bar"
Now we see: when we use
instance_eval, we route around our current object and all of our methods are ignored within the block. Ruby really only has two levels of scope: whatever belongs to self and whatever belongs to Kernel.
This state of affairs is unsatisfactory: we would like to introduce a DSL in such a way that we retain access to all of our methods without kludges (like storing the current object in an instance variable).
Nesting ScopesYou can think of the current
scope as being nested inside of the top-level scope.
instance_eval doesn’t change the scope for things like local variables, it just points
self elsewhere.
What we want is a new scope for our DSL nested inside of the current scope. So when we search for a method, we should check the DSL. If we don’t find it there, check the current object’s scope. If we don’t find it there, check the top-level.
Those who do not learn from the History of Lisp are doomed to repeat it.
Oops.
John McCarthy called from 1960. He wants Lisp’s
dynamic scoping back. Yes, our new feature is almost fifty years old. This is why either a through grounding in CS theory or a hobbyist’s interest in the history of programming are important for programming: much of what we want to do has already been done before, and sometimes in unexpected contexts. Who would have thought that a technique for helping programmers collaborate with Bond Traders has roots in Lisp 1.5?
Here’s an implementation of a nested scope construct that does exactly what we want. You declare a new class that extends
DomainSpecificLanguage, and then you can use methods from your DSL, from your current object, and from the top-level (if you so choose). For example:
require 'dsl'
class MyDSL < DomainSpecificLanguage
def bjarne
'Barney'
end
end
class TheGreat
def phred
'Fredrick'
end
def test
plus = ' plus '
MyDSL.eval { p phred + plus + bjarne }
end
end
TheGreat.new.test
##### "Fredrick plus Barney"
This does
exactly what we want with methods.
There's also a single extension to
kernel, the method
with.
with replaces the
eval method so you can also say:
with MyDSL do
p phred + plus + bjarne
end
The
eval method creates a new instance of your DSL class, so you can track state within an evaluation. For example:
class Censor < DomainSpecificLanguage
attr_reader :ok_on_tv
def initialize (given_binding)
super(given_binding)
@ok_on_tv = true
end
def say something
something.split.each do |word|
@ok_on_tv = false if ['feces', 'urine', 'love', 'pudendum', 'fellator', 'oedipus', 'mammaries'].include?(word)
end
end
end
class GeorgeCarlin
def test
Censor.eval {
say "People much wiser than I have said, I'd rather have my son watch a film with two people making love than two people trying to kill one other."
say "And I of course agree. I wish I know who said it first, and I agree with that."
ok_on_tv
}
end
end
p GeorgeCarlin.new.test
##### "false"
letThe first obvious drawback of this approach is that the blocks we pass to
eval cannot take parameters. For this reason, rumour has it that a method called
instance_exec will be added to Ruby in 1.9. (There are some
implementations available that work in Ruby 1.8 if you would like to experiment.)
The second is that you don’t get anything like nested local variables, a ‘la Pascal, Scheme, or any other language with block structure. Block structure is very powerful: You can use a variable within a particular scope and nowhere else. Here’s a trivial example:
with Let do
let :x => 0, :y => 1 do
assert_equal(1, x + y)
let :x => 2 do
assert_equal(3, x + y)
end
assert_equal(0, x)
end
end
We're using the
with syntax. In the
Let DSL, there’s a new method called
let.
let creates a new DSL within
Let. You can see that re-declaring
x does not clobber the value in the outer scope. That is because when
let wrote a new DSL, it added
x and
y as methods.
So really, that block of code says “Write a new DSL where
x and
y are methods returning zero and one. Execute some code in that new DSL. That code will create
another DSL where
x is a method returning two.”
Because
let defines methods and not local variables, bad things happen when you try to override real local variables. It’s best to use
Let for some things and local variables for others, but not mix the two.
Like what, you ask?
List Comprehensions in RubyA
List Comprehension is syntactic sugar that lets you build collections using set-like notation. For example,
S = [ x | x<-[0..], x^2>3 ] is a list comprehension in Haskell.
Here is a
List Comprehensions DSL in Ruby. Let’s say we’re building up a multiplication table. We want tuples of the form
[x, y, x * y] given
x is in the range
1..12 and
y is in the range
1..12. Let’s write that:
require 'comprehension'
class MultiplicationTable
def twelve_by_twelve
with Comprehension::DSL do
list { [x, y, x * y] }.given(:x => 1..12, :y => 1..12)
end
end
end
p MultiplicationTable.new.twelve_by_twelve
##### [[1, 1, 1], [1, 2, 2], [2, 1, 2], [1, 3, 3], [2, 2, 4] ...
(In everyday use, you don’t need a class and a method for each comprehension: the important bit is
list { [x, y, x * y] }.given(:x => 1..12, :y => 1..12). I just wrote it this way so you could see that comprehensions work fine inside of methods. You can also use more than one comprehension inside of a single
with Comprehension::DSL do... end block: see the unit tests for examples.)
The expression in the block doesn’t have to be a tuple:
class MultiplicationTable
def twelve_by_twelve
with Comprehension::DSL do
list { "#{x} times #{y} is #{x * y}" }.given(:x => 1..12, :y => 1..12)
end
end
end
p MultiplicationTable.new.twelve_by_twelve
##### ["1 times 1 is 1", "1 times 2 is 2", "2 times 1 is 2", "1 times 3 is 3", "2 times 2 is 4", ...
And you can stick a “where” block on the end:
class MultiplicationTable
def twelve_by_twelve_odds
with Comprehension::DSL do
list { "#{x} times #{y} is #{x * y}" }.given(:x => 1..12, :y => 1..12) { (x % 2 == 1) && (y % 2 == 1) }
end
end
end
p MultiplicationTable.new.twelve_by_twelve_odds
##### ... 3 times 5 is 15", "5 times 3 is 15", "7 times 1 is 7", "1 times 9 is 9", ...
Would you like to nest them? Your expression is the interpreter’s command:
class MultiplicationTable
def odds_times_evens
with Comprehension::DSL do
list { "#{x} times #{y} is #{x * y}" }.given(
:x => list { x }.given(:x => 1..12) { x % 2 == 0 } ,
:y => list { x }.given(:x => 1..12) { x % 2 == 1 } )
end
end
end
p MultiplicationTable.new.odds_times_evens
##### ... "2 times 11 is 22", "4 times 9 is 36", "6 times 7 is 42", ...
List Comprehensions and LetWhat is the relationship to
Let? Well,
Let builds the scopes needed for evaluating the where clause and the block defining the elements of the list. Yes, we’ve built a DSL on top of a DSL on top of a DSL. Does this seem like weird trickery? I don’t know why. Do you have
any idea how many levels of abstraction are responsible for you reading this essay right now?
This is what we humans do: we build tools on top of tools. Your browser runs on an OS, possibly in a VM, perhaps in a hypervisor, on top of a BIOS, and on and on. This is the normal state of affairs, not an exception.
Closing RemarksIt is possible to build DSLs in Ruby to facilitate cross-functional teamwork and separation of concerns. Care must be taken to avoid polluting the top-level name space, but it is possible to work within sandboxes and still have access to the current object’s context.
Oh yes, and programming is fun as always
Source CodeUpdate: The copy of dsl.rb has been updated to the latest version. I had committed a rather typical manual synchronization error: I copied the latest file to the wrong directory when I first posted this. Thanks, Justin!How to try it for yourself: Open
DomainSpecificLanguage and Let. Save the text only (not the HTML) as
dsl.rb. Open
Comprehension. Save the text only as anything you like, as long as it is in the same directory as
dsl.rb: I use
comprehension.rb. Run
comprehension.rb.
- I generally call “Bullshit!” on any line of reasoning that sets up a straw man argument just to knock it down. So read on with skepticism!
- Alan Kay has said that he regrets popularizing the notion of “Object-Oriented” programming, and that he should have called it “Message-Oriented” programming.
Labels: lispy, popular, ruby
Why Why Functional Programming Matters Matters
I recently re-read the amazing paper
Why Functional Programming Matters (“WhyFP”). Although I thought that I understood WhyFP when I first read it a few years ago, when I had another look last weekend I suddenly understood that I had missed an important message.
1Now obviously (can you guess from the title?) the paper is about the importance of one particular style of programming, functional programming. And when I first read the paper, I took it at face value: I thought, “Here are some reasons why functional programming languages matter.”
On re-reading it, I see that the paper contains insights that apply to programming in general. I don’t know why this surprises me. The fact is, programming language design revolves around program design. A language’s design reflects the opinions of its creators about the proper design of programs.
In a very real sense, the design of a programming language is a strong expression of the opinions of the designer about good programs. When I first read WhyFP, I thought the author was expressing an opinion about the design of good programming languages. Whereas on the second reading, I realized he was expressing an opinion about the design of good programs.
Can we add though subtraction?It is a logical impossibility to make a language more powerful by omitting features, no matter how bad they may be.
Is this obvious? So how do we explain that one reason Java is considered “better than C++” is because it omits manual memory management? And one reason many people consider Java “better than Ruby” is because you cannot open base classes like
String in Java? So no, it is not obvious. Why not?
The key is the word
better. It’s not the same as the phrase
more powerful.
2 The removal or deliberate omission of these features is an expression about the idea that programs which do not use these features are better than programs which do. Any feature (or removal of a feature) which makes the programs written in the language better makes the language better. Thus, it
is possible to make a language “better” by removing features that are considered harmful,
3 if by doing so it makes programs in the language better programs.
In the opinion of the designers of Java, programs that do not use
malloc and
free are safer than those that do. And the opinion of the designers of Java is that programs that do not modify base classes like
String are safer than those that do. The Java language design emphasizes a certain kind of safety, and to a Java language designer, safer programs are better programs.
“More powerful” is a design goal just like “safer.” But yet, what does it mean? We understand what a safer language is. It’s a language where programs written in the language are safer. But what is a “more powerful” language? That programs written in the language are more powerful? What does that mean? Fewer symbols (the “golf” metric)?
WhyFP asserts that you cannot make a language more powerful through the removal of features. To paraphrase an argument from the paper,
if removing harmful features was useful by itself, C and C++ programmers would simply have stopped using malloc and free twenty years ago. Improving on C/C++ was not just a matter of removing
malloc and
free, it was also a matter of adding automatic garbage collection.
This space, wherein the essay ought to argue that Java compensates for its closed base classes by providing a more powerful substitute feature, left intentionally blank.
At the same time, there is room for arguing that some languages are improved by the removal of harmful features. To understand why they may be improved but not more powerful, we need a more objective definition of what it means for a language to be “more powerful.” Specifically, what quality does a more powerful programming language permit or encourage in programs?
When we understand what makes a program “better” in the mind of a language designer, we can understand the choices behind the language.
FactoringFactoring a program is the act of dividing it into units that are composed to produce the working software.
4 Factoring happens as part of the design. (
Re-factoring is the act of rearranging an existing program to be factored in a different way). If you want to compare this to factoring in number theory, a well designed program has lots of factors, like the number
3,628,800 (
10!). A
Big Ball of Mud is like the number
3,628,811, a prime.
Composition is the construction of programs from smaller programs. So factoring is to composition as division is to multiplication.
Factoring programs isn’t really like factoring simple divisors. The most important reason is that programs can be factored in orthogonal ways. When you break a program into subprograms (using methods, subroutines, functions, what-have-you), that’s one axis of factoring. When you break an a modular program up into modules, that’s another, orthogonal axis of factoring.
Programs that are well-factored are more desirable than programs that are poorly factored.
In computer science, separation of concerns (SoC) is the process of breaking a program into distinct features that overlap in functionality as little as possible. A concern is any piece of interest or focus in a program.
SoC is a long standing idea that simply means a large problem is easier to manage if it can be broken down into pieces; particularly so if the solutions to the sub-problems can be combined to form a solution to the large problem.
The term separation of concerns was probably coined by Edsger W. Dijkstra in his paper On the role of scientific thought.
—Excerpts from the Wikipedia entry on
Separation of ConcernsPrograms that separate their concerns are well-factored. There’s a principle of software development,
responsibility-driven design. Each component should have one clear responsibility, and it should have everything it needs to carry out its responsibility.
This is the separation of concerns again. Each component of a program having one clearly defined responsibility means each concern is addressed in one clearly defined place.
Let’s ask a question about Monopoly (and Enterprise software). Where do the rules live? In a noun-oriented design, the rules are smooshed and smeared across the design, because every single object is responsible for knowing everything about everything that it can ‘do’. All the verbs are glued to the nouns as methods.
In a game design where you have important information about a rule smeared all over the object hierarchy, you have very poor separation of concerns. It looks at first like there’s a clear factoring “Baltic Avenue has a method called
isUpgradableToHotel,” but when you look more closely you realize that every object representing a property is burdened with knowing almost all of the rules of the game.
The concerns are not clearly separated: there’s no one place to look and understand the behaviour of the game.
Programs that separate their concerns are better programs than those that do not. And languages that facilitate this kind of program design are better than those that hamper it.
Power through features that separate concernsOne thing that makes a programming language “more powerful” in my opinion is the provision of more ways to factor programs. Or if you prefer,
more axes of composition. The more different ways you can compose programs out of subprograms, the more powerful a language is.
Do you remember
Structured Programming? The gist is, you remove
goto and you replace it with well-defined control flow mechanisms: some form of subroutine call and return, some form of selection mechanism like Algol-descendant
if, and some form of repetition like Common Lisp’s
loop macro.
Dijkstra’s view on structured programming was that it promoted the separation of concerns. The factoring of programs into blocks with well-defined control flow made it easy to understand blocks and rearrange programs in different ways. Programs with indiscriminate jumps did not factor well (if at all): they were difficult to understand and often could not be rearranged at all.
Structured
68k ASM programming is straightforward in theory. You just need a lot of boilerplate, design patterns, and the discipline to stick to your convictions. But of course, lots of 68k ASM programming in practice is only partially structured. Statistically speaking, 68k ASM is not a structured programming language even though structured programming is possible in 68k ASM.
Structured Pascal programming is straightforward both in theory and in practice. Pascal facilitates separation of concerns through structured programming. So we say that Pascal “is more powerful than 68k ASM” to mean that in practice, programs written in Pascal are more structured than programs written in 68k ASM because Pascal provides facilities for separating concerns that are missing in 68k ASM.
For example: working with listsConsider this snippet of iterative code:
int numberOfOldTimers = 0;
for (Employee emp: employeeList) {
for (Department dept: departmentsInCompany) {
if (emp.getDepartmentId() == dept.getId() && emp.getYearsOfService() > dept.getAge()) {
++numberOfOldTimers;
}
}
}
This is an improvement on older practices.
5, 6 For one thing, the
for loops hide the implementation details of iterating over
employeeList and
departmentsInCompany. Is this better because you have less to type? Yes. Is it better because you eliminate the fence-post errors associated with loop variables? Of course.
But most interestingly, you have the beginnings of a
separation of concerns: how to iterate over a single list is separate from what you do in the iteration.
Try calling a colleague on the telephone and explaining what we want as succinctly as possible. Do you say “We want a loop inside a loop and inside of that an if, and…”? Or do you say “We want to count the number of employees that have been with the company longer than their departments have existed.”
One problem with the
for loop is that it can only handle one loop at a time. We have to nest loops to work with two lists at once. This is patently wrong: there’s nothing inherently nested about what we’re trying to do. We can demonstrate this easily: try calling a colleague on the telephone and explaining what we want as succinctly as possible. Do you say “We want a loop inside a loop and inside of that an if, and…”?
No, we say, “We want to count the number of employees that have been with the company longer than their departments have existed.” There’s no discussion of nesting.
In this case, a limitation of our tool has caused our concerns to intermingle again. The concern of “How to find the employees that have been with the company longer than their departments have existed” is intertwined with the concern of “count them.” Let’s try a different notation that separates the details of
how to find from the detail of
counting what we’ve found:
old_timers = (employees * departments).select do |emp, dept|
emp.department_id == dept.id && emp.years_of_service > dept.age
end
number_of_old_timers = old_timers.size
Now we have separated the concern of finding from counting. And we have hidden the nesting by using the
* operator to create a Cartesian product of the two lists. Now let’s look at what we used to filter the combined list,
select. The difference is more than just semantics, or counting characters, or the alleged pleasure of fooling around with closures.
* and
select facilitates separating the concerns of how to filter things (like iterate over them applying a test) from the concern of what we want to filter. So languages that make this easy are more powerful than languages that do not. In the sense that they facilitate additional axes of factoring.
The Telephone TestLet’s look back a few paragraphs. We have an example of the “Telephone Test:” when code very closely resembles how you would explain your solution over the telephone, we often say it is “very high level.” The usual case is that such code expresses a lot more
what and a lot less
how. The concern of what has been very clearly separated from the concern of how: you can’t even
see the how if you don’t go looking for it.
In general, we think this is a good thing. But it isn’t free: somewhere else there is a mass of code that supports your brevity. When that extra mass of code is built into the programming language, or is baked into the standard libraries, it is nearly free and obviously a Very Good Thing. A language that doesn’t just separate the concern of how but does the work for you is very close to “something for nothing” in programming.
But sometimes you have to write the
how as well as the
what. It isn’t always handed to you. In that case, it is still valuable, because the resulting program still separates concerns. It still factors into separate components. The components can be changed.
I recently separated the concern of describing “how to generate sample curves for some data mining” from the concern of “managing memory when generating the curves.” I did so by writing my own lazy evaluation code (Both the
story and the
code are on line). Here’s the key “what” code that generates an infinite list of parameters for sample beziér curves:
def magnitudes
LazyList.binary_search(0.0, 1.0)
end
def control_points
LazyList.cartesian_product(magnitudes, magnitudes) do |x, y|
Dictionary.new( :x => x, :y => y )
end
end
def order_one_flows args = {}
height, width = (args[:height] || 100.0), (args[:width] || 100.0)
LazyList.cartesian_product(
magnitudes, control_points, control_points, magnitudes
) do |initial_y, p1, p2, final_y|
FlowParams.new(
height, width, initial_y * height,
CubicBezierParams.new(
:x => width, :y => final_y * height,
:x1 => p1.x * width, :y1 => p1.y * height,
:x2 => p2.x * width, :y2 => p2.y * height
)
)
end
end
That’s it. Just as I might tell you on the phone: “Magnitudes” is a list of numbers between zero and one created by repeatedly dividing the intervals in half, like a binary search. “Control Points” is a list of the Cartesian product of magnitudes with itself, with one magnitude assigned to
x and the other to
y. And so forth.
I will not say that the sum of this code and the code that actually implements infinite lists is shorter than imperative code that would intermingle loops and control structures,
entangling
what with
how. I will say that it separates the concerns of what and how, and it separates them in a different way than
select separated the concerns of what and how.
So why does “Why Functional Programming Matters” matter again?The great insight is that better programs separate concerns. They are factored more purely, and the factors are naturally along the lines of responsibility (rather than in Jenga piles of
abstract virtual base mixin module class proto_ extends private implements). Languages that facilitate better separation of concerns are more powerful in practice than those that don’t.
WhyFP illustrates this point beautifully with the same examples I just gave:
first-class functions and
lazy evaluation, both prominent features of modern functional languages like Haskell.
WhyFP’s value is that it expresses an opinion about what makes programs better. It backs this opinion up with reasons why modern functional programming languages are more powerful than imperative programming languages. But even if you don’t plan to try functional programming tomorrow, the lessons about better programs are valuable for your work in
any language today.
That’s why
Why Functional Programming Matters matters.
- And now I’m worried: what am I still missing?
- Please let’s not have a discussion about Turing Equivalence. Computer Science “Theory” tells us “there’s no such thing as more powerful.” Perhaps we share the belief that In theory, there’s no difference between theory and practice. But in practice, there is.
- I am not making the claim that I consider memory management or unsealed base classes harmful, but I argue that there exists at least one person who does.
- The word “factor” has been a little out of vogue in recent times. But thanks to an excellent post on reddit, it could make a comeback.
- So much so that we won’t even bother to show what loops looked like in the days of
for (int i = 0; i < employeeList.size(); ++i).
- Another organization might merge employees and departments, or have each department “own” a collection of employees. This makes our example easier, but now the data doesn’t factor well. Everything we’ve learned from databases in the last forty years tells us that we often need to find new ways to compose our data. The relational model factors well. The network model factors poorly.
Labels: java, lispy, popular, ruby
Haskell, Ruby and Infinity
Languages like Haskell support
lazy evaluation. In principle, you only compute what you actually need, everything else just goes away. If you usually need everything you compute, this may seem like a frill: elegant, even interesting, but having little practical importance. I find it is very much like Tail Call Optimization: if you don’t have it, you code around it. Often it makes no difference, but from time to time there is a case where your code will be clearer and more maintainable if you express yourself succinctly and let the compiler do the work of making it efficient.
This post is a switch from mumbling about
Y Combinators and using trivial cases to explain interesting ideas: I’m going to show some actual code I’m writing. I apologise in advance if this adds so much background that it obscures the message about lazy evaluation: extremely simplistic examples sometimes work against the argument because it is too easy to think of other ways to accomplish the needed results.
I don’t apologise
at all for the unpolished nature of the code. This isn’t a textbook. And besides:
Anybody can say you can’t write. Let no one say you don’t.
—Ken Rand, courtesy of Chalain
I’ve been
hacking some naïve code to cluster data sets.
In computer programming, a hacker is a software designer and programmer who builds elegant, beautiful programs and systems… For some, “hacker” has a negative connotation and refers to a person who “hacks” or uses kludges to accomplish programming tasks that are ugly, inelegant, and inefficient.
—Wikipedia
The clustering algorithm requires a very large, fixed set of curves.
1 I wrote the initial curve generation by building a gigantic list of parameter tuples, and then processing the list into records. Once the “search space” grew beyond a trivial size, the program began to eat enormous amounts of memory. The problem was that I was trying to write the generation code as clearly as possible, and that created a massive thicket of objects that all resided in memory simultaneously.

The curves being generated are paths composed of cubic beziérs. Each segment on the path requires specifications for four different control points.
I rejected the idea of a rewrite into looping, imperative form, I wanted to separate the “list comprehension” code from the “make it run in less than a gigabyte” code. Instead, I chose to use lazy evaluation: the principle is that although the code defines a huge data structure that takes up gigabytes, we only actually evaluate things as we need them, and we discard them when we are done, so the total memory footprint for the lazy form is about the same as the memory footprint for an imperative form.
The easiest way to perform the refactoring—besides a rewrite in Haskell—was to switch all of the arrays I used for a lazy list data structure. A lazy list is a linked list composed of head and tail tuples (known as
cons cells). Where a normal linked list holds an element and the rest of the list, a lazy list holds an element and a function for computing the rest of the list (
SICP calls them “
streams.”)
If a modicum of care is taken not to pin objects down, you can build fantastically large lazy lists and process them at will. In fact, lazy lists can be infinitely large if you provide the appropriate function for generating the list
For that reason, you should never append one lazy list onto another lazy list. Consider
odd = LazyList.unfoldr(1) { |n| n + 2 } and
even = LazyList.unfoldr(0) { |n| n + 2 }. If you append
odd onto
even, the resulting list in theory has both even and odd numbers. But in practice you can never reach the odd numbers because there is an infinite quantity of even numbers at the front of the list.
Instead,
merge them together to produce a lazy union of the two lists. Merge interleaves alternating elements from each list, so the resulting lazy list contains all the elements of each list, and if you take a finite sample, such as the first 1,000 elements, you get 500 of each.
even.merge(odd) => (0 1 2 3 ...). (Other interesting operations that work with infinite lists include
cons,
pairwise and
product.)
Any sufficiently complicated Lisp or Ruby program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Haskell.
My current version of a
LazyList class in Ruby is
here. As soon as I switched from the “eager” code using arrays to the “lazy” code using lazy lists, memory use went down and performance went up. Not as much as a switch to imperative (and not even close to writing a better clustering algorithm), but enough that I can move forward.
Here is an example of a function using arrays to enumerate choices (
distribute(2, 1..4) -> [[1, 2], [1, 3], [1, 4], [2, 3], [2, 4], [3, 4]]):
def distribute choices, range
spots = (range.end - range.begin) + 1
return [] if choices <= 0 || choices > spots
remaining_choices = choices - 1
if remaining_choices == 0
return range.to_a.map { |spot| [spot] }
else
(range.begin..(range.end - remaining_choices)).to_a.inject([]) { |acc, first_spot| acc + (distribute(remaining_choices, (first_spot + 1)..range.end).map { |remaining_distribution| remaining_distribution.unshift(first_spot) }) }
end
end
And rewritten with lazy lists:
def distribute choices, range
spots = (range.end - range.begin) + 1
return LazyList.new() if choices <= 0 || choices > spots
remaining_choices = choices - 1
if remaining_choices == 0
LazyList.from_range(range).map { |spot| LazyList.from_element(spot) }
else
LazyList.from_range(range.begin..(range.end - remaining_choices)).mapr(:merge) do |first_spot|
distribute(remaining_choices, (first_spot + 1)..range.end).map do |remaining_distribution|
LazyList.cons(first_spot, remaining_distribution)
end
end
end
end
It’s almost identical.
LazyLists replace arrays,
mapr replaces accumulating a list with
inject, and
cons replaces
unshift, but otherwise it’s the same code. Mission accomplished: we’ve changed the behaviour without having to change the way we express our algorithms. Had we wanted to, we could have supported enough array syntax that Lazy Lists look exactly like arrays, but that isn’t necessary.
And when running the entire generator, the memory footprint is dramatically lower. For example, this small routine quoted above no longer generates an array of arrays: it generates a structure that can generate the lists when needed. Switching the data structure changes the evaluation behaviour of the generator code: it gets us 80% of having an imperative structure without having to tie the code to the implementation.
This is exactly what separation of concerns is all about:
green code here, yellow code there.
Lazy evaluation is the red pillIt’s great to make a change to the code’s behaviour without changing the way we think about the solution to our problem. But you know what? It isn’t
interesting. In fact, if lazy optimization was something we needed on a regular basis, we ought to switch to a
language that does it for us. Implementation details ought to be left to libraries, compilers, and virtual machines. That’s what they’re for.
A language that doesn’t affect the way you think about programming, is not worth knowing.
—Alan Perlis
Now that we hold in our hands a tool for making infinitely long lists, the question to ask is, “how does that affect the way we think about generating sample curves?”
The first thing that comes to mind is to ask why the set of sample curves we generate is finite. It isn’t really finite, there are an infinite number of curves. (And not just any infinity, it’s Aleph One, the children’s book
One Two Three… Infinity
explained that forty years ago.)
What we really want is a
finite sample of that infinite set of curves. Now let’s zoom in and think about a single parameter, the
y value for one of the control points on a curve. Let’s say it’s a rational number between
0.0 and
1.0 inclusive. It’s relatively easy to generate a list of rationals in that range. Say we generate
0.0000...00001 (with jillions of zeroes elided: we want the smallest number our computer can represent), followed by
0.0000...00002 (the second smallest), and so on.
If we leave the computer running for a few months or years, we should get all of the numbers between
0.0 and
1.0 that our computer can represent. That shouldn’t be hard. But even taking into account the finite limitations on a computer’s representation, that list is infinitely long.
Useful samplesWe need to take a finite sample of all of the infinite possibilities in that list.
When working with an infinite list, what we want is that the front part of the list, the part we can access before the heat death of the Universe, contains a useful sample. If we were sampling a list like
(0.0000...00001 0.0000...00002 0.0000...00003 0.0000...00004 ... ) for control points, the first values are not useful at all. We’d just get a line along the
x access.
There are various strategies for generating a more useful list. We could create an infinite list of random values. I’ll outline another method below. But the take-away idea is this: we need lists where the most useful values are first. Working with infinite lists encourage us to think of infinitely sized sets of samples, ordered by usefulness.
Let’s say that when we generate
y, we devise a list like
(0.0 1.0 0.5 0.75 0.25 ... ). We’re successively bisecting the intervals (which is why the library method is called
binary_search). We can stop at any time and we have decent distribution. (And we can pipe dream that in the future, we can train our generator to favour points more likely to be useful to us.) That is much more useful!
Here’s some Lazy List code that does just that: given two seeds and an optional bisection block, it generates an infinite list of values:
def binary_search seed1, seed2, &block
bisection_block = block || lambda { |x, y| (x-y)/2 + y }
LazyList.cons(
seed1,
LazyList.new(
seed2,
delay {
bisect(seed1, seed2, &bisection_block)
}
)
)
end
# infinitely bisects a space, exclusive of the seed values
def bisect seed1, seed2, &block
return EMPTY if seed1 == seed2
half = block.call(seed1, seed2)
LazyList.new(
half,
delay {
merge(
bisect(seed1, half, &block),
bisect(half, seed2, &block)
)
}
)
end
And indeed,
LazyList.binary_search(0.0, 1.0) -> (0.0 1.0 0.5 0.75 0.25 ... ).
The Grand Hotel
Satan, Cantor, and Infinity and Other Mind-boggling Puzzles
is a five-star introduction to Cantor’s work on infinity, including a special treat: a completely different proof that Aleph One is greater than Aleph Zero based on games, very much in the style of Conway’s Surreal Numbers. Currently out of print, but
please get yourself a used copy from a bookseller: you won’t be disappointed!
What happens when we try to combine two infinite lists? For example, we want a list of
(x, y) tuples, generated as the Cartesian Product of our binary search list with itself. If we use a typical breadth-first or depth-first algorithm, we’re sunk. We get
( (0.0 0.0) (1.0 0.0) (0.5 0.0) (0.75 0.0) (0.25 0.0)... ). That has a nice distribution along one axis but still useless. What we need is a way of combining two infinite lists such that they give us a nice useful sample when we take a finite number of elements off the front.
Here’s a table describing how the ‘breadth-first’ algorithm for generating the product of two lists works. Consider two lists of three elements each. In the table below, columns represent one list and rows the other. The numbers represent the order that the product of the lists is generated:
| Breadth-first mapping from elements to integers |
|---|
| 1 | 2 | 3 |
| 4 | 5 | 6 |
| 7 | 8 | 9 |
This works fine for finite lists, but as we saw above it is useless for infinite lists. But looking at it in table form is useful. The product of two lists
is tabular, the problem with a traditional algorithm is the order that we select elements from the table, not that the elements are tabular. Just as we solved the problem of how to sample
y magnitudes by changing the order in which we selected rationals between
0.0 and
1.0, what we need to do with Cartesian Products, what we need to do is select the products in an order that provides useful results.
As it happens, there’s an order that generates useful finite samples when dealing the product of two infinite lists: Instead of generating rows and appending them to each other, it generates
diagonals and
merges them with each other. The first diagonal is
1, 5, 9. The second is
4, 8. The third is
2, 6. And so on. The advantage of this algorithm is that if given two infinite lists, it starts in the upper left-hand corner and supplies values, working its way right and down.
2Here’re the first four values of the product of two binary searches with each other:
[{:x=>0.0, :y=>0.0}, {:x=>0.0, :y=>1.0}, {:x=>1.0, :y=>0.0}, {:x=>0.0, :y=>0.5}]. And the next four:
[{:x=>1.0, :y=>1.0}, {:x=>1.0, :y=>0.5}, {:x=>0.5, :y=>0.0}, {:x=>0.0, :y=>0.25}]. And the next eight:
[{:x=>0.5, :y=>0.5}, {:x=>0.5, :y=>0.25}, {:x=>0.5, :y=>1.0}, {:x=>1.0, :y=>0.25}, {:x=>0.25, :y=>0.25}, {:x=>0.25, :y=>0.75}, {:x=>0.25, :y=>0.0}, {:x=>0.0, :y=>0.75}].
We can take as many samples as we want, and more sample give us more “resolution.” We now have the tools we need to get rid of all of the limits and finite combinations and focus on describing what we’re trying to generate without intermingling a lot of generating code.
And I’m thinking about eternity
Some kind of ecstasy got a hold on me
—Bruce Cockburn, Wondering Where The Lions Are
Now that we can generate an infinite list of useful magnitudes, plus we can combine infinite lists in a useful way, we can define an infinite list of sample curves. Here’s the exact definition of an infinite list of partial beziérs (there are only three control points because the origin of the beziér is always the terminus of the previous beziér in the curve we are generating):
def sample_cubic_beziers
magnitudes = LazyList.binary_search(0.0, 1.0)
control_points = LazyList.cartesian_product(magnitudes, magnitudes) do |x, y|
{ :x => x, :y => y }
end
LazyList.cartesian_product(control_points, control_points, control_points) do |p1, p2, p3|
{
:p1 => p1,
:p2 => p2,
:p3 => p3
}
end
end
That’s really simple, and really easy to understand.
3 The domain-specific code that defines a cubic Beziér is concise and readable. And the infrastructure code that handles combinatorics is shunted away out of sight where it doesn’t clutter up our thinking.
So what have we seen?
- Lazy evaluation solved a performance problem without needing an extensive rewrite of our generation code;
- Taking the red pill, infinite lists allowed us to radically simplify the original code.
Are there some opportunities to steal an idea like lazy evaluation from other languages and use them to simplify your code?
- The data is a set of
(x,y) tuples where x is time and y is a magnitude. The effort remaining in a sprint backlog is an example of this kind of data set. I need a function that computes the distance between any two data sets. I already have a function that computes the distance between a curve through the xy space and a data set, so if I build a lot of curves and then take the distance between each curve and each data set, I can find the distance between pairs of data sets by searching for the curve with the lowest total distance.
This operation is On2, but that’s why we invented distributed processing. Also, this operation is done infrequently relative to operations making use of the clusters, something like Google’s indexing being far less frequent than searching Google. And of course, when my MacBook overheats and burns its way through my desk, I can come back and replace the brute force operation with something more intelligent.
Well, all this requires generating the sample curves. I have code that makes curves out of Beziérs, provided I supply coördinates for the control points. the curves are actually ActiveRecord models. So all I needed was some simple code that generates all of the possible combinations and then write them to the database. It’s the kind of thing candidates write in job interviews where employers actually care about their stone cutting skills.
- This problem is congruent to the proof that the set of all points on a plane is the same size as the set of all points in a line.
- And it could be simpler yet: a “list comprehensions” DSL for lazy lists
is on the wish list would make things far more readable.
Labels: lispy, ruby
Programming Language Stories
CA fellow needs a dog house, so he hires a contractor. This guy is a C programmer who’s between jobs at the moment, and he says forget those wimpy “agile” wooden dog houses, he can build the dog house out of bricks, it will last forever and is stronger than any wood structure.
The client is doubtful, but that sounds very impressive, and the contractor says this is how all the real-world dog houses are made, so he agrees. Well, it takes forever: it seems every time the contractor thinks it’s finished, they discover there’s a chink between the bricks and there’s a leak. With wood you just trim to fit, but with bricks you have to plan everything in advance so that the bricks line up just right.
But Spring turns to Summer turns to Autumn, and the job is done. The client pays up, and the contractor’s about to leave when he stops and thinks for a moment before speaking.
“Hey,” the contractor says, “You paid for a pallet of bricks, and I used all but one. It’s a shame if it goes to waste. Why don’t I see if I can use it somewhere?” The client asks like what, and before you know it the contractor has scored a contract to build a brick path for the dog from the back door to the dog house. He orders some more bricks, gets to work right away, and this time things go a little better.
The contractor is now doing C++ on the side. And he uses some new bricklaying techniques he learned from
Effective C++
: there are these funny angle braces everywhere thanks to his path building templates. Soon that job is done and the contractor collects his fee. He’s about to leave, but he reaches into his tool chest and pulls out a brick.
“Hey,” the contractor says, “You paid for another pallet of bricks, and I used all but one on the path. It’s a shame if it goes to waste. Why don’t I see if I can use it somewhere?” The client is happy with the path, asks like what, and before you know it the contractor has scored another contract, this time to build a brick wall enclosing the yard. He orders some more bricks, and gets to work right away.
Meanwhile, the contractor has become a Visual C++ programmer and reads MSDN magazine religiously. He applies his new skills to building the wall: he has all these fancy tools that can put up sections of scaffolding when you push a button. of course, there are huge holes you have to fill in yourself, and the scaffolding has all these crazy joins and angles, it’s hard to change anything or understand how the entire wall fits together, but it feels like you’re getting a lot done because the system seems to move a lot of bricks for you.
Well, the wall takes forever, and it doesn’t look that good when it’s finally done. But the client pays up and the contractor is ready to leave. But he stops for a moment and pulls a brick out of his tool chest.
“Hey,” the contractor says, “You paid for two more pallets of bricks for the wall, and I used all but one. It’s a shame if it goes to waste. Why don’t I see if I can use it somewhere?”
The client has had enough of this, and he decides he’s going to show the contractor how he feels. “Give me that brick,” he orders. The surprised contractor hands it over. “I don’t give a rat’s ass about this brick” the client huffs, and he throws the brick clear out of his yard.
“I need to get work done, not talk to contractors all day about bricks. And besides, I’ve talked to the big landscape contracting companies, and they all say nobody is using bricks, everyone has switched to sharp tools and netting. Now get out.”
Ruby on RailsJust inside the gates of heaven, St. Peter sits at a desk checking people in.
Peter: “Welcome to heaven. Programming language?”
The man at the front of the line says, “
Smalltalk
.”
Glancing at his clipboard, Peter says, “Room 33. Be very quiet as you pass Room Six.”
The process repeats itself with the next person in line:
Peter: “Welcome to heaven. Programming language?”
Person #2: “
Common Lisp
.”
Peter: “Room 17. Be very quiet as you pass Room Six.”
The next person moves to the front of the line with a look of curiosity on her face.
Peter: “Welcome to heaven. Programming language?”
Person #3: “
Python
.”
Peter: “Room 54. Be very quiet as you pass Room Six.”
Person #3: “Why do you keep telling us to be quiet as we pass Room Six?”
Peter: “Because the
Ruby on Rails
People are in Room Six, and they think they’re the only ones here.”
—Snarfed from Eric Sink’s Baptists and Boundaries
JavaA mother was preparing the pot-roast for Sunday’s big family dinner. Before searing it and placing it in the pan, she carefully sliced the ends off. Her three year-old daughter asked “Mommy, why do you cut the ends off the roast?”
She answered, “My mom taught me to do it that way, and it’s delicious, so it must be a good idea. Maybe the juices from the meat mix with the vegetables?”
Everyone sits down for dinner, and when Father is serving the roast, the daughter remembers her question. She turns to Grandma and asks, “Grandma, why did you teach Mommy to cut the ends off the roast?”
Grandma thinks for a moment and says, “What a delightful question! I always used to cut the ends off the roast, it’s how
my mother taught me. I don’t know why she did that, there must have been a good reason.” Grandma sits for a moment, remembering her mother. “Well,” she continues, ” there must have been a good reason. Now eat your dinner before it gets cold!”
The holidays roll around, and they’re having dinner at Grandma and Grandpa’s house. “Hey!” Grandma says to the little girl, “You know what, I was going through your great-grandmother’s things, and I found the old roast pan. We sure made some good pot roast in it. Let’s make pot roast with it tonight in her honour!”
They get out the pan and wash it up. It’s old, and well-seasoned. Grandma looks it over. “It’s smaller than I remember. I was a little girl, and everything looks bigger when you’re small!”
They put the roast on the counter before searing it. Just then, Grandpa walks by. “Do you want me to cut the ends off the roast?” he asks. “It’s the only way you’ll get it to fit in that small pan.”
Grandma and the little girl look at each other. And they smile.
SchemeA fellow is on vacation, and he decides to ride one of those “scenic” double-decker buses that drive around from attraction to attraction. The upper level is open-air, and he’s the only one up there. He looks around, and decides that instead of re-reading his copy of
The Little Schemer
, he’s going to really enjoy himself: he lights a cigar.
No sooner has he lit the cigar when a woman comes huffing and puffing upstairs, carrying one of those sub-miniature dogs people use as fashion accessories. She takes one look at our man and demands he put the cigar out: it is
offensive to the dog.
“Madam,” our man asserts, “this cigar is not offensive. Let me tell you its secrets.”
“Eschewing inferior mass-market tobacco, I order whole, functional leaves—at great expense and with difficulty—from a small collective in Cuba that still grow cigar tobacco the old-fashioned, original way.”
“Unsullied by automatic rolling machines or other push button devices, I roll each cigar, myself, by hand. Using secret tobacco-macro techniques that are impenetrable to the typical ignorant cigarette smoker, I am able to pack as much smoking pleasure into a single cigar as would take a carton of cigarettes—nay, it is impossible to enjoy a smoke from regular cigarettes as much as I enjoy the elite, exclusive flavour of my cigars.”
The woman is not impressed. “If your cigars are so good, how come nobody smokes them?” she asks. “If your cigars are so good, how come so many clubs and restaurants permit cigarettes but forbid cigars?”
Our fellow is so irate he begins to splutter. “If you or your dog are unable to understand that the purity, the perfection achieved long ago on a remote island far from consumerism and mass productio—” But he is unable to finish, for the woman imperiously sweeps the cigar from his mouth and flings it over the side of the bus.
“That’s what I think of your pompous cigar.” She tells him.
He glares at her for a moment, then years of oppression, of being forced to live with the hated cigarettes against his will, bring him to a boil and he commits an unthinkable act. With a wrench, he seizes the dog and flings it after his cigar. There is a moment of uncomprehending shock, and then the woman begins to howl.
Well, the bus driver hears all this and slams on the brakes. For a moment everything is higgledy-piggledy, and just as the woman’s wails return to their full volume, everyone is shocked to see the dog come trotting up, unharmed.
And what do you know: the dog has—nestled gently in its mouth—the most unexpected thing:
Gur Oevpx.
If you enjoyed these stories, you may also like
Why You Need a Degree to Work For BigCo and more recently,
A Venture Capitalist passes away peacefully, and....
Labels: java, lispy, ruby
Why Ruby is not an acceptable implementation of Lisp
The other day, while actually writing software (as opposed to
talking about writing software), I wrote the following:
distance = lambda do |given_scale|
(magnitudes.map { |pair|
(pair.evidence - (pair.flow * given_scale)) ** 2
}).inject { |acc, distance| acc + distance }
end
If you are curious,
distance is a function that computes the distance between a curve (called a ‘flow’) and some evidence, given a float that scales the flow. It’s used for finding a curve that ‘looks like’ a set of data points.
Can you spot the bug? I’ll expunge all of the excess and give you some code you can try yourself:
require 'test/unit'
class TestRubyNames < Test::Unit::TestCase
def test_lambda_clobbering
foo = lambda { (1..3).map { |foo| foo ** 2 } }
assert_nothing_raised(NoMethodError) { foo.call() }
assert_raise(NoMethodError) { foo.call() } # WTF?
end
def test_local_clobbering
foo = [1, 4, 9]
bar = lambda { (1..3).map { |foo| foo ** 2 } }
bar.call
assert_equal(3, foo) # WTF?
end
end
Is this what you expect? Ruby
looks like a nice, block-structured language with lexical scope. But its standard implementation has serious issues. I believe these are
well-known and understood issues, but issues nonetheless.
One more example. I swore I wouldn’t write anything else about fixed-point combinators for a very long time, but try this exercise at home. Here’s my
curry combinator:
require 'test/unit'
class ExempliGratia < Test::Unit::TestCase
CURRY = lambda { |f, a| lambda { |*b| f.call(a, *b) } }
def test_clean_up_loose_ends
maker = lambda { |f|
lambda { |func_with_me| CURRY.call(func_with_me, func_with_me) }.call(
CURRY.call(lambda { |inner_func, me, *args|
inner_func.call(CURRY.call(me, me)).call(*args) }, f)) }
factorial = maker.call(
lambda { |f| lambda { |n| n.zero? && 1 or n * f.call(n-1) } }
)
assert_equal(120, factorial.call(5))
end
end
Looks good. Now let’s prove that
CURRY is referentially transparent by replacing every
CURRY variable with the lambda:
maker = lambda { |f|
lambda { |func_with_me| (lambda { |f, a| lambda { |*b| f.call(a, *b) } }).call(func_with_me, func_with_me) }.call(
(lambda { |f, a| lambda { |*b| f.call(a, *b) } }).call(lambda { |inner_func, me, *args|
inner_func.call((lambda { |f, a| lambda { |*b| f.call(a, *b) } }).call(me, me)).call(*args) }, f)) }
factorial = maker.call(
lambda { |f| lambda { |n| n.zero? && 1 or n * f.call(n-1) } }
)
begin
factorial.call(5)
puts 'fine'
rescue SystemStackError
puts 'wtf?'
end
Hunh? Ruby
looks like a nice language supporting anonymous functions, but sometimes they work and sometimes they don’t.
I strongly suspect that this issue has something to do with the fact that with the ‘naked’ lambdas we’re creating new functions with every call, whereas the version using a CURRY variable re-uses the same function, so the system stack overflows in one case and not the other.Update: Mike Harris proved that this second problem is the same as the first problem: once again, variables are clobbering each other all over the place. In Ruby 1.8, blocks do not create new scope,
at all.
Here's Mike's submission, indented to fit:
def test_full_anonymity
assert_equal(120, (lambda { |f0|
lambda { |func_with_me|
(lambda { |f1, a1|
lambda { |*b1| f1.call(a1, *b1) }
}).call(func_with_me, func_with_me)
}.call(
(lambda { |f2, a2|
lambda { |*b2| f2.call(a2, *b2) }
}).call(
lambda { |inner_func, me, *args|
inner_func.call(
(lambda { |f3, a3|
lambda { |*b3| f3.call(a3, *b3) }
}).call(me, me)).call(*args)
}, f0))
}).call(
lambda { |f| lambda { |n| n.zero? && 1 or n * f.call(n-1) } }
).call(5))
end
Stop your sobbingI’m
not saying Ruby is an unacceptable Ruby.
I’m
especially not saying “See, that’s why we should stick with language X.” If language X doesn’t even try to support this kind of programming, it’s a lame argument to say that it’s a better choice. That’s like saying that your 1972
Pinto is safer than my 2004 MX-3 because the Pinto doesn’t have air bags, and air bags can injure small children if they are sitting in the passenger seat.
If you try to bend Ruby to do Scheme-ish things, you will run into some corner cases, some places where it does the unexpected. Can you figure them out and avoid them (some people will say avoiding corner cases makes for a
Leaky Abstraction)?
I think if you’re writing Ruby code in Ruby, you can easily avoid these problems. That’s why Ruby is an acceptable Ruby.
Why isn’t Ruby an acceptable implementation of Lisp?But why isn’t Ruby an acceptable implementation of Lisp? (Besides the lack of support for
hygienic macros, of course!)
The problem is that Lisp or Lisp-style programming is all about metaprogramming. Code that writes code. Code that evals code. It’s easy to look at something like the curry combinator and say “no programmer would ever write that convoluted a function in production.” But macros and code generators routinely combine snippets of code to produce monstrosities that a human wouldn’t even considering writing.
When writing a piece of code that writes code, you are heavily dependent on the underlying implementation being regular and corner-case free. The whole point of writing code-writing code is that you intend for it to generate variations and combinations on your templates or scaffolds. And it’s only a matter of time before you are combining template A with generator B and meta-programming method C.
When the underlying language is robust, none of this matters. The result works. If you look at it under the hood, you’re horrified at the result. But frankly, that isn’t an argument against meta-programming, it’s an argument
for it: let the machine do the dirty jobs we don’t want to do, and let us deal with a nice abstraction like
A plus B plus C that just works, never mind what the code looks like.
(Don’t even think about arguing that we should do away with code that writes code. That debate was won by the code generation folks back when COBOL and FORTRAN and ALGOL were young. From time to time you’re welcome to pull out your assembly if you like, but we use High Level Languages because we want the machine to write the messy crap for us.)
My fear is that code that writes code might one day write code that clobbers a variable.
Or code that writes code might write some nested lambdas that overflow the stack.And that’s why Ruby is not an acceptable implementation of Lisp.
Well, ok, I confess: the title is a bit of muckraking.
Ruby is a pretty good implementation of Lisp, and anyone who has two brains to rub together can point out that the issue is not whether there’s some corner case “gotcha,” but
whether using Ruby is, on the whole, more beneficial than not using Ruby.
For what it’s worth, I consider corner cases like this a strong endorsement of trying new languages like Ruby: it’s like finding out that your sailplane has a tendency to go into an inverted flat spin when you loop at an unacceptably slow speed. It’s terrifying when it first happens, but you get through it and then you realize that the only reason you got into trouble was that you were pushing yourself hard.
Labels: lispy, ruby
Guest Blogger: Tom Moertel derives the Y combinator in Ruby
In a comment on But Y would I want to do a thing like this?, Tom Moertel derived an elegant version of the actual Y combinator in Ruby. I reproduce his words here with permission.
Tom uses the alternate syntax for call,
[...]. I avoided this to keep clear of a tiresome debate about operator overloading. (On one side: those who hate it. On the other side: those that hate Ruby disallowing overloading (...)).
Ok Tom, take it away!
Here’s my original Ruby definition:
def y(&f)
lambda { |*args| f[y(&f)][*args] }
end
To get rid of the recursive call
y(&f), let’s replace it with a variable
yf, whose value we’ll say is magically equivalent to the result of the recursive call. To effect the binding of
yf, we’ll need to wrap the method body in a new lambda expression and pass in the magic value:
def y(&f)
lambda { |yf|
lambda { |*args| f[yf[yf]][*args] }
}[ <<< placeholder for yf: we'll fill this hole next >>> ]
end
So, what value should we pass in for
yf? Because the variable represents the value that would have been computed by calling
y(&f), which is just the body of the
y method, we can fill the hole with a duplicate of the body:
def y(&f)
lambda { |yf|
lambda { |*args| f[yf[yf]][*args] }
}[ lambda { |yf| lambda { |*args| f[yf[yf]][*args] } } ]
end
And we have our Y combinator:
fac = y { |rec| lambda { |n| n < 2 ? 1 : n * rec[n-1] } }
fac[5] # ==> 120
While the combinator works, its implementation repeats itself: there are two copies of the lambda expression
lambda { |yf| lambda { |*args| f[yf[yf]][*args] } }
in the method body. To factor them out, let’s let the variable
x represent this expression. Then the method body simplifies to
x[x], which is very satisfying. To set
x to the correct value, however, we need to add a wrapper lambda that binds
x to the original expression. Putting it all together, we arrive at the final version:
def y(&f)
lambda { |x| x[x] } [
lambda { |yf| lambda { |*args| f[yf[yf]][*args] } } ]
end
And it, too, works:
acc_fac = y { |rec| lambda { |n,acc| n < 2 ? acc : rec[n-1, n*acc] } }
tail_fac = lambda { |n| acc_fac[n, 1] }
tail_fac[5] # ==> 120
Cheers! —
TomLabels: lispy, ruby
But Y would I want to do a thing like this?
Choose life. Choose a job. Choose a starter home. Choose dental insurance, leisure wear and matching luggage. Choose your future. But Y would I want to do a thing like that?
Writing about
first-class functions and their
compatibility with object-oriented programming naturally leads to the
Y combinator. And that is the point where eyes glaze over and soft, snoring sounds rise from RSS readers everywhere.
But please bear with me, this essay is not really about the Y combinator, it’s about learning new things and expanding our capacity to think.
Sharpening the sawYears ago I picked up Steven Covey’s book
The 7 Habits of Highly Effective People
. If the book is a test out of seven, I really wasn’t doing very well.
If you’ve read the book, you probably remember that he talked about “Sharpening the Saw,” investing in your own abilities. That’s incredibly important, but I don’t need to tell you that. If you exercise with programming katas, or learn a new programming language once a year, or pick up a book like
The Reasoned Schemer
and actually go through the exercises, then you are already in the top 1% of software developers for personal skills improvement. (Sorry, certifications
don’t count. They are the classic case of doing the wrong thing for the wrong reason!)
New ideas—by which I mean, new to you—are an important way to sharpen your saw. If for no other reason than this: the brain needs regular exercise to perform at or near its potential. Learning new things keeps you sharp, even if you don’t directly use the things you learned.
Others have suggested that learning Lisp is beneficial to your programming skills in its own right. That’s one good way to sharpen your saw. But I add to that an important caveat: to obtain the deepest benefit from learning a new language, you must learn to think in the new language, not just learn to translate your favourite programming language syntax and idioms into it.
Think differentThe interesting thing about that is that almost by definition, if you see something in, say, Lisp that solves a problem you already have, you won’t learn much from the Lisp code. It is tempting to think that Lisp (or any other language) will somehow do what you’re already doing in some wonderfully magic way that is obviously better. But no, that isn’t how it really works.
For your problems are tuned to your existing tools. You simply can’t imagine problems that your tools can’t solve well, much less can’t solve at all. That’s why there are so few continuation-based web servers. Who’s going to invent one unless they have a programming paradigm with continuations?
And worse, when a new tool is applied to a problem you think you know well, you will probably dismiss the things the new tool does well. Look at how many people dismiss brevity of code. Note that all of the people ignore the statistics about the constant ratio between bugs and lines of code use verbose languages. Look at how many people dismiss continuation-based servers as a design approach. Note that all of them use programming languages bereft of control flow abstractions.
Thus, to truly learn a new tool, you must not just learn the new tool, you must
apply it to new kinds of problems. It’s no good learning to replace simple iteration with first-class functions: all you’ve learned is syntax. To really learn first-class functions, you must seek out problems that aren’t easily solved with iteration.
The Why of YWhich leads me back to fixed point combinators. They appear to have no practical (as in making money) use. And that’s why I’m suggesting to you that you figure out how to make one in your language of choice. The very fact that the problem is far outside of your realm of “practicality” guarantees that you will learn something. You won’t be simply applying your same-old, same-old techniques and patterns to a slightly new problem.
Start your research with Richard P. Gabriel’s
The Why of Y. Try porting his examples directly to your favourite programming language. If what you want to use is too brain-damaged to support closures, you may need to do a little greenspunning and build a little
functor infrastructure.
Don’t be dissuaded if you have to follow the functor route: you are learning far more about your language and about programming in general than the shmoes that settle for learning five new buzzwords related to the latest WS-* interoperability with XPath 3.x.
If you prefer a
fun approach to learning, you can do not better than Raymond Smullyan’s
To Mock a Mockingbird
: an enjoyable romp through the world of combinatory logic. After reading this book, you will have mastered the S, K, I, Y, and other combinators. Added bonuses include a safe that can only be opened by applying Gödel’s Incompleteness Theorem to its combination. How can you read this book and
not learn?
Eating my own dog foodI thought of a few things to say along these lines last week and then I abruptly realizing I was asking you to “Do as I say, not as I do.” What good is recycling problems I first encountered in University textbooks two decades ago? I put this post aside and set to work on a problem of my own.
I set out to write a function for making recursive functions—a function
extensionally equal to the Y combinator—in Ruby. The ultimate goal is to take something like:
lambda { |f| lambda { |n| n.zero? && 1 or f.call(n-1) } }
And be able to have it call itself recursively. In this case, to compute the
factorial function.
This is trivial, given that Ruby supports named recursion, but if you want to write a fixed-point combinator you want to write a function that makes recursive functions
without using the host language’s support for named recursive calls. In other words, you are bootstrapping named recursion out of anonymous first-class functions.
1There are important theoretical implications of being able to do this, but the killer reason to try it is to learn.
I started my quest for a function for making recursive functions with a rather trivial observation based on OO programming and the Curry function:
require 'test/unit'
class ExempliGratia < Test::Unit::TestCase
CURRY = lambda { |f, a| lambda { |*b| f.call(a, *b) } }
def test_recursive_curry
maker = lambda { |func_with_me|
CURRY.call(func_with_me, func_with_me)
}
assert_equal(120, maker.call(lambda { |me, n| n.zero? && 1 or n * me.call(me, n-1) }).call(5))
end
end
In OO some language implementations,
this (or
self) is a hidden parameter passed to each method. Thus, there’s a parameter—
me in the example code—that is added for handling recursion. If you write a recursive function—like the venerable
factorial—with the extra
me parameter, a trivial currying operation evaluates it recursively without any need for names.
This is obviously deficient. As noted above, we want to write
factorial like so:
lambda { |n| n.zero? && 1 or f.call(n-1) }
We’ll need an
f from somewhere, and just as our Scheme colleagues do, we’ll bind one as a parameter in an enclosing lambda. So we want to write:
lambda { |f| lambda { |n| n.zero? && 1 or f.call(n-1) } }
And somehow this should be transformed into a working
factorial function. For the test-driven crowd, we want to write:
def test_clean_up_loose_ends
maker = ...
factorial = maker.call(
lambda { |f| lambda { |n| n.zero? && 1 or n * f.call(n-1) } }
)
assert_equal(120, factorial.call(5))
iterative_factorial = maker.call(
lambda { |f| lambda { |n, acc| n.zero? && acc or f.call(n - 1, n * acc) } }
)
tail_factorial = lambda { |n| iterative_factorial.call(n, 1) }
assert_equal(120, tail_factorial.call(5))
end
Of course, we need some code for
maker. And the
iterative_factorial case shows that
maker works for functions with more than one parameter. The solution I came up with is:
CURRY = lambda { |f, a| lambda { |*b| f.call(a, *b) } }
maker = lambda { |f|
lambda { |func_with_me| CURRY.call(func_with_me, func_with_me) }.call(
CURRY.call(lambda { |inner_func, me, *args|
inner_func.call(CURRY.call(me, me)).call(*args) }, f)) }
The source code with each transformation from beginning to end is
here (I strongly suspect that this “curry combinator” is actually the Y combinator with a huge amount of cruft hanging off it).
Unique or derivative, crap or craft, the process of getting it to work has enriched my mind by forcing me outside of my usual problem space. I still can’t think of a practical application for what I’ve just written. But I know I’ve stretched myself.
And now back to you: perhaps you’re rushing off to try to implement a fixed-point combinator from first principles. Perhaps your plan is to code the canonical examples in your usual language. Those are both good paths. But whether you follow them today or not, remember the underlying principle exemplified by the fixed-point combinator:
Do not dismiss impractical or weird problems. While you may not have an immediate application for the code you write to solve such problems, you are maximizing your learning when you venture outside of your usual problem space.
- Named recursion is stuff like
foo = lambda { |...| foo.call(:bar) }. It takes advantage of the host language’s variable binding to recurse. If you want anonymous recursion, you should be able to assign the same lambda to another name and have it work just as well, as in: fufu = lambda { |...| foo.call(:bar) }. That won’t work if you are relying on Ruby’s name for foo.
p.s. Don’t miss
Tom Moertel’s derivation of the Y combinator in Ruby.
Labels: lispy, passion, popular, ruby
HOF or OOP? Yes!
It’s interesting how the brain works: I wrote an essay that I thought was
an explanation of what first class functions and closures are, and why some people would like to see them added to the
Kitchen Sink of Languages.
Then I read
From Functional to Object-Oriented Programming, which suggests that my essay seems to be making a general statement about Functional Programming as an advance over Object Oriented Programming.
1Nothing could be further from my intention or my opinion. Smalltalk, for example, has a clean and powerful syntax for first-class functions. And those first-class functions are objects, as is
everything in Smalltalk. I am tempted to rewrite my essay using Smalltalk: it seems that using
Ruby, another language where everything is an object, is not making this point strongly enough. In my mind, it is easy to use languages that provide the OO paradigm as well as the first-class function paradigm. At the same time.
What I actually saidSo what I actually said is: “Here are these things called
first-class functions. This “thing” is where functions are exactly the same kind of thing as everything else in a language. This is useful, here is why.” If functions are first class, functions aren’t magically more special, nor are they second-class citizens in a language (like those brain-damaged primitives in crippled languages). If a language is an OO language, this implies that first-class functions in such a language are objects and that programming with them
is OO programming.
You’ll notice that I am not using the words “Functional Programming.” That’s because
I’m not talking about Functional Programming. I’m talking about closures and first-class functions. I imagine that these things are present in all contemporary Functional Programming Languages, but that has nothing to do with my essay about the value of having first-class functions in an OO language: Smalltalk has had them in an OO language
for nearly thirty years, and while it wasn’t the first OO language, it is the canonical definition of class-based OO.
That’s all I said. If you want to debate the merits of pure functional programming languages like
Haskell against OO languages like Ruby, you have to take that up with someone like
Tom or
Eric.
What I don’t understandI don’t understand the author’s objection to higher-order functional programming. He(?) says something about higher-order functions and type inference as being bad. I don’t get the argument. It seems like he is saying that the constructs are somehow too deeply nested, that it is too easy to make mistakes. I may not understand him correctly, for I have trouble seeing how this is different from having data structures with a thicket of HAS-A relationships.
Somehow we program just fine in business with complex data schema, and we manage to keep track of things just fine. Strong typing and compiler support helps. So does IDE support. So does drawing diagrams. How is “a function taking an integer and a function taking two integers, returning a function taking a list of integers and returning an integer” somehow more complex than “a record containing an integer and that has many records containing an integer”?
The argument seems to be (and I am open to correction, since what I think he’s saying is so obvious that I worry I’m misunderstanding it) that naming things makes all the difference, as in “a customer record with an ID has many Sales Records, each of which has an ID.” That does sound easier to understand.
But of course, nothing stops you from naming things if you have first-class functions in a type checked OO language, does it? I am not a practitioner, but I have been working my way through
The Little MLer
, and it is part of the ML programming paradigm to name first-class functional types for precisely the reasons the author seems to advocate.
Anonymity considered harmful?So perhaps the objection is to having too many anonymous things, to having too many values without names. There is certainly an appropriate balance to be sought. One language designer seems to dislike anonymous functions, to the point where his language won’t let you have any that won’t fit on a single line. He has an
opinionated language.
I have seen similar arguments with non-functions. That control flow branches should not be nested too deeply (you know,
case and
if and the evil ternary operator). I often use the “extract variable” refactoring to simplify expressions and make them more readable. Naming a part of an expression provides a certain level of abstraction that improves comprehension.
Types and expressions involving functions are no different, and if that’s what the author is saying I endorse his perspective.
That statement, however, is somewhat orthogonal to whether first-class functions should exist in a language. As long as you can name them, you have the tool to write comprehensible code. If your language has types, then you ought to be able to name types involving first-class functions. As long as you can name functional values and types, you can apply all the same style guidelines to first-class functions that you apply to record types.
In ConclusionFirst-class functions are a natural fit with OO, as evidenced by their presence in OO languages that aren’t glorified PDP-11 assemblers with some OO stuff bolted on the side by people with
very little OO and/or GUI programming experience. First-class functions can be used to write clean and legible code, using all of the same techniques we use for writing clean and legible code with records, objects, and other types.
From Functional to Object-Oriented Programming has some interesting points about FP and OOP and their historical context. This has nothing to do with my original post, but I think those words are worth reading and considering. The post raises some caveats about complex and anonymous first-class functions that are obviously sensible: we have noticed the same things with other forms of expression in programming.
- I will come clean here: when I think of OO, I am not thinking of C++ or Java, especially as they are practised in business. I’m thinking of languages like Self, Actor, and Smalltalk.
Labels: lispy
Closures and Higher-Order Functions
There has been a great deal of interest in
closures lately, driven in great part by the fact that there is talk of adding some form of anonymous functions to the Java. Most of the time, people talk about “adding closures” to Java, and that prompts a flurry of questions of the form “what is a closure and why should I care?”
The discussion around closures tends to go on and on about the “closing over” of free variables and only lightly touch on the biggest change to Java: functions as first-class objects with a lightweight syntax for creating them. Making it easy to do something basic like define a new function is more than just a little syntactic sugar: it makes it easy to do new things with functions that were impractical when you needed a lot of boilerplate to make anything work.
Without understanding functional programming, you can’t invent MapReduce, the algorithm that makes Google so massively scalable.
I’m going to try to explain first class functions using Ruby (it
is possible to write code that does exactly the same thing using the current Java feature set, however the result is so wordy that it obscures the basic idea being presented: call it accidental complexity, or perhaps
yellow code.)
Ruby is a good language for demonstrating features that ought to be in Java. Like Java, Ruby uses squiggly brace syntax. Like Java, everything in Ruby is an object—whoops, Java has primitives. Okay, like Java, functions are represented as objects.
In Java you write:
interface IFromIAndI {
Integer call(Integer a, Integer b);
}
IFromIAndI add_two_integers = new IFromIAndI() {
public Integer call(final Integer a, final Integer b) {
return a + b;
}
};
(The Java convention is to name things in
lowerCamelCase, but we’ll ignore that. If you need to print this essay on a dot-matrix printer you may want to make some changes first.)
In Ruby you write the function as:
add_two_integers = lambda { |a,b| a + b }
Later on, when you want to call your function in Java, you write:
add_two_integers.call(35, 42);
And if you like semicolons, you write the
exact same thing in Ruby:
add_two_integers.call(35, 42);
You can do the same thing with multiplication:
multiply_two_integers = lambda { |a,b| a * b }
First Class FunctionsIn the examples above, functions look a little like methods. The Java version is obviously implemented as a method. But what we did in both cases was assign the resulting function to a variable. In Java, assigning a method to a variable is not particularly easy (it is possible using reflection).
Anything that can be assigned to a variable is a
value. If it can also be passed as a parameter or returned from a method (or function), we say it is a
first class value. Functions as first class values, or first class functions, are very interesting. For example, what can we do passing a function as a parameter to another function?
Hmmm. Well, I am breaking a cardinal rule of selling something. We’re talking about shiny new toys without identifying a problem to be solved. Let’s talk about my favourite problem: writing the same thing more than once, violating the
DRY principle.
Here are two pieces of similar Ruby code:
adder_wth_acc = lambda { |acc, list|
if list.empty?
acc
else
adder_wth_acc.call(acc + list.first, list[1..-1]) # [1..-1] returns a copy of the list without the first element
end
}
adder = lambda { |list|
adder_wth_acc.call(0, list)
}
adder.call([1, 2, 3, 4, 5])
And:
multiplier_with_acc = lambda { |acc, list|
if list.empty?
acc
else
multiplier_with_acc.call(acc * list.first, list[1..-1]) # [1..-1] returns a copy of the list without the first element
end
}
multiplier = lambda { |list|
multiplier_with_acc.call(1, list)
}
multiplier.call([1, 2, 3, 4, 5])
What do they both do? Pretty much the same thing: they accumulate the result of some binary operation over a list of values.
adder accumulates addition, and
multiplier accumulates multiplication. You could call this a “Design Pattern.” If you did that, you would use the exact chunk of code everywhere. I would call that retrograde. Didn’t our predecessors invent the subroutine so we could eliminate writing the exact same piece of code over and over again?
Why can’t we do the same thing? Well, we can. A subroutine does the same thing over and over again, but it takes different parameters as it goes. What is different between adder and multiplier? Ah yes, the adding and multiplying. Functions. What we want is a function that takes a function as a parameter.
Well, we said that with first-class functions, functions are values and can be passed as parameters. Let’s try it:
folder = lambda { |default_value, binary_function, list|
fold_with_acc = lambda { |acc, list|
if list.empty?
acc
else
fold_with_acc.call(binary_function.call(acc, list.first), list[1..-1])
end
}
fold_with_acc.call(default_value, list)
}
Now we can use our function that takes functions as a parameter:
folder.call(0, add_two_integers, [1, 2, 3, 4, 5])
folder.call(1, multiply_two_integers, [1, 2, 3, 4, 5])
This is
much better. When functions can take functions as parameters, we can build abstractions like
folder and save ourselves a lot of code. Note that this would be a lot harder to read if we had to surround all of our functions with Object boilerplate in Java. That’s one of the key reasons why ‘syntactic sugar’—making it brief—is a big win.
And you know what? Functions are values, not just variables that happen to hold functions. These work just as well:
folder.call(0, lambda { |a, b| a + b }, [1, 2, 3, 4, 5])
folder.call(1, lambda { |a, b| a * b }, [1, 2, 3, 4, 5])
There’s just one problem (actually two, but I’m saving one for later): everywhere you use our new
folder function, you need to remember that
add_two_integers needs a default value of zero, but
multiply_two_integers needs a default value of one. That’s bad. Sooner or later you will get this wrong.
What we need is a way to call
folder without having to always remember the correct initial value. Should we extend our understanding of a function to include a default initial value for folding? If we’re thinking in Java, maybe our
IFromIAndI interface needs a
getDefaultFoldValue? I think not. Why should a function know anything about how it’s used? And besides, as we build other abstractions out of functions we’ll need more stuff.
If we aren’t careful, we’ll end up implementing the
Visitor pattern on functions, and all of our brevity will go out the window. No, what we want is this: in one place we define that folding addition starts with a default value of zero and in another place we say we want to fold, say,
[1, 2, 3, 4, 5] with addition. Then when we want to fold something else with addition, like
[2, 4, 6, 8, 10], we shouldn’t have to say anything about zero again.
Adding CurryWhat we need is a function that folds addition. Didn’t we say that functions are values that can be returned from functions? How about a function that makes a folding function? We should pass it our initial value and our binary function, and it should return a function that performs the fold without needing an initial value as a parameter:
fold_coder = lambda { |default_value, binary_function|
fold_with_acc = lambda { |acc, list|
if list.empty?
acc
else
fold_with_acc.call(binary_function.call(acc, list.first), list[1..-1])
end
}
lambda { |list|
fold_with_acc.call(default_value, list)
}
}
Now we can do the following:
adder = fold_coder.call(0, lambda { |a, b| a + b })
adder.call([1, 2, 3, 4, 5])
adder.call([2, 4, 6, 8, 10])
No more remembering that addition starts with a default of zero.
Actually, there’s a far simpler way to avoid having to remember the default value when you want to fold over addition. But let’s just play along so that we don’t have to come up with an entirely new set of examples to demonstrate the value of functions as first-class values.
Functional programmers (as opposed to the rest of us
dysfunctional programmers) will recognize this as
currying our
folder function. Currying is when a function takes more than one parameter and you combine one of the parameters and the function to produce a function that takes fewer parameters.
Here’s a currying function in Ruby:
curry = lambda { |fn,*a|
lambda { |*b|
fn.call(*(a + b))
}
}
(
This is an improvement on an earlier version, thanks to Justin's comment.)
So you can use our new function to create an increment function out of our adder and a treble function out of our multiplier:
plus_one = curry.call(add_two_integers, 1)
times_three = curry.call(multiply_two_integers, 3)
If you are ever asked, “what good is currying?,” I hope I’ve given you an example you can use to explain why currying matters, and why people do it all the time (possibly without explicitly naming it). Although it doesn’t look like much when looking at trivial examples like functions that multiply by three, it’s much more useful when creating folders and mappers where you want some of the parameters to remain constant.
CompositionOur examples combined functions and non-functions to create new functions. Here’s an example from a recent post,
Don’t Overthink FizzBuzz, where I give a method for
composing two functions. The idea is that if you have multiple functions that each take one argument, you can combine them using
compose. I also have a method that generates functions,
carbnation:
# fizzbuzz.rb
def compose *lambdas
if lambdas.empty?
lambda { nil }
elsif lambdas.size == 1
lambdas.first
else
lambda do |n|
lambdas.first.call(compose(*lambdas[1..-1]).call(n))
end
end
end
def carbonation(modulus, printable_form)
i = 0
lambda do |n|
(i = (i + 1) % modulus) == 0 && printable_form || n
end
end
print(((1..100).map &compose( carbonation(15, 'FizzBuzz'), carbonation(5, 'Buzz'), carbonation(3, 'Fizz'))).join(' '))
The simple explanation of how it works is that
carbonation generates functions that replace every so many elements of a list with a printable string.
Compose composes any two or more methods together. So if you want to print out 100 numbers, but replace every third number with “Fizz,” every fifth with “Buzz,” and all those that are third
and fifth with “FizzBuzz,” you generate a function for each replacement, compose them together with compose, and then map the numbers from one to one hundred to the resulting überfunction.
When you look at this today, it seems weird and unreadable by Java standards. I wonder if adding first-class functions with simple syntax to Java will lead the Java community to a place where code like this will not appear out of place?
Just one more thingSo we started by saying that people are getting hung up on what makes a closure a closure, and there has been less emphasis on the benefits of using functions as first-class values. Did you notice that our
folder function actually includes a non-trivial closure?
If you look at the
fold_with_acc function, it makes use of
binary_function, a variable from its enclosing lexical scope. This is not possible with the current version of Java: if you translate this to Java, when you make
fold_with_acc and anonymous inner class, you will have to copy
binary_function into a final member to use it. It simply won’t compile if you try an idiom-for-idiom translation, even adding explicit types.
And then if you look at the anonymous function it returns,
lambda { |list| fold_with_acc.call(default_value, list) }, that anonymous function uses
default_value,another variable from the enclosing lexical scope. Once again you will have to fool around with final variables to make this work, or perhaps declare full-fledged object with constructors.
(If you try writing this simple example out in Java, you quickly find yourself inventing a lot of classes or interfaces. And they have some complicated types, like a function taking an integer and a function taking two integers, returning a function taking a list of integers and returning an integer.
After twenty minutes of that, you understand why the
ML and
Haskell communities use
type inference: If the types are that complicated, it’s incredibly helpful to have the compiler check them for you. Yet if the types are that verbose, it’s incredibly painful to write them out by hand. Even if your IDE were to write them for you, they take up half the code, obscuring the meaning.
You also get why the Ruby on Rails community doesn’t care about type checking: types for CRUD applications are way less complicated than types for first-class functional programs.)
That’s InterestingPart of the interest in closures is in simplifying the syntax around functions, and part of the interest is in the way that access to enclosing scope would simplify a lot of code. There’s a whole debate around the value of simplification in a world where all serious languages are
Turing Equivalent.
I hope you’re convinced, by now, that programming languages with first-class functions let you find more opportunities for abstraction, which means your code is smaller, tighter, more reusable, and more scalable.
For me, simpler is just nicer until something reaches a certain tipping point: when it becomes so simple that the accidental complexity of using it goes away, I will start using it without thinking about it. Tail optimization is like that: as long as recursion is slower than iteration and sometimes breaks, I have to think about it too much. But when I’m not burdened with “except…” and “when performance is not a factor…” it becomes natural.
And then something interesting happens. It changes the way I look at problems, and one day I see a whole new way to do something that I never saw before. Functions as first class values are definitely one of those things that change everything.
Further Reading
If this has whet your appetite for more,
Structure and Interpretation of Computer Programs is
the book on higher-order functions and how they can be used as building blocks to create more elaborate abstractions such as object-oriented programming.
The Seasoned Schemer devotes an entire book to the uses of functions. Although the examples are in Scheme, the language is dead simple to learn and the techniques in the book can be applied to Ruby and Java (or at least to a future version of Java where you do not need functors).
The second edition of
Programming Ruby is an indispensable guide. Even if you will not be using Ruby immediately, pick it up and discover why so many people are lauding the language's simple, clean design and powerful Lisp-like underpinnings.
As the author says, “
Higher-Order Perl: Transforming Programs with Programs
is about functional programming techniques in Perl. It’s about how to write functions that can modify and manufacture other functions.
“Why would you want to do that? Because that way your code is more flexible and more reusable. Instead of writing ten similar functions, you write a general pattern or framework that can generate the functions you want; then you generate just the functions you need according to the pattern. The program doesn’t need to know in advance which functions are necessary; it can generate them as needed. Instead of writing the complete program yourself, you get the computer to write it for you.”
It’s worth reading even if you have no intention of using Perl: the ideas span languages, just as SICP is worth reading even if you don’t use Scheme at work. And be sure to read
Higher-Order JavaScript and
Higher-Order Ruby. They translate HOJ’s ideas to other languages.
Notable Follow-ups:Some useful closures, in Ruby: “The
Dylan programming language included four very useful functions for working with closures:
complement,
conjoin,
disjoin and
compose. The names are a bit obscure, but they can each be written in a few lines of Ruby.”
From Functional to Object-Oriented Programming: “OO allows a traceable connection between the conceptual design level and the implementation level. Concepts have names, so you can talk about them, between programmers and architects.”
HOF or OOP? Yes!: “First-class functions are a natural fit with OO, as evidenced by their presence in OO languages that aren’t glorified PDP-11 assemblers with some OO stuff bolted on the side.”
But Y would I want to do a thing like this?: “To truly learn a new tool, you must not just learn the new tool, you must apply it to new kinds of problems. It’s no good learning to replace simple iteration with first-class functions: all you’ve learned is syntax. To really learn first-class functions, you must seek out problems that aren’t easily solved with iteration.”
Labels: java, lispy, popular, ruby
Don't Overthink FizzBuzz
I noticed some traffic coming to my post on
juggling from
Using FizzBuzz to Find Developers who Grok Coding. Like me, the author is having trouble with the fact that
199 out of 200 applicants for every programming job can’t write code at all. I repeat:
they can’t write any code whatsoever.
UPDATE: If you think that I just claimed that 199 out of 200 working programmers cannot code, stop immediately and either read my follow-up explaining the math, or read Joel’s article on why companies aren’t as picky as they think they are. Thank you.So the author has decided to set the bar ridiculously low. As the post says:
I’ve come to discover that people who struggle to code don’t just struggle on big problems, or even smallish problems (i.e. write a implementation of a linked list). They struggle with tiny problems.
He gives an example of a “tiny” problem that is quite suitable for a phone interview:
Write a program that prints the numbers from 1 to 100. But for multiples of three print “Fizz” instead of the number and for the multiples of five print “Buzz”. For numbers which are multiples of both three and five print “FizzBuzz”.
One of the nice things about using a tiny problem is that anyone who has written actual code in the last six months will solve it in less than ten minutes. This is good because (a) it leaves more time for talking about important things, and (b) if the candidate can’t solve the problem you will find out right away and save yourself a lot of grief.
As you can imagine from a post like this, quite a few people have stepped forward with
creative solutions. Here’s mine:
# fizzbuzz.rb
# see http://weblog.raganwald.com/2007/01/closures-and-higher-order-functions.html
def compose *lambdas
if lambdas.empty?
lambda { nil }
elsif lambdas.size == 1
lambdas.first
else
lambda do |n|
lambdas.first.call(compose(*lambdas[1..-1]).call(n))
end
end
end
def carbonation(modulus, printable_form)
i = 0
lambda do |n|
(i = (i + 1) % modulus) == 0 && printable_form || n
end
end
print(((1..100).map
&compose(
carbonation(15, 'FizzBuzz'),
carbonation(5, 'Buzz'),
carbonation(3, 'Fizz'))).join(' '))
Right away I think you want to be careful if you ever trot a solution like this out in an
actual interview. Here are some of the things that could go wrong:
- The interviewer merely wanted to go through the motions of demonstrating you could program. Now she’s going to have to get into a discussion with you about how your solution works, which is irritating and wastes time that could have been invested more profitably on a design question;
- The interviewer might presume that you will always find the most indirect route to the destination and decide you belong in an ivory tower;
- The interviewer might not understand your solution at first glance but will make up for this by “bikeshedding” and grilling you about some minor nit like the failure modes of using
&& and || instead of the if statement or the ternary operator;
- The interviewer might have just finished The Seasoned Schemer and grill you over why your solution doesn’t use the Y Combinator to implement
compose (snap!).
Of course, some interviewers are just looking for an excuse to talk about technical issues and will not prejudge you in any way. Speaking for myself, I would laugh if you wrote something like this out. But then again, I once worked for someone who had won the second
International Obfuscated C Contest.
Update:
Don’t Overthink the Interview EitherIf you are conducting an interview and you receive a cryptic answer, or an impossibly concise one-liner, or perhaps something like the above that is redolent of Higher Order Functions, do you embrace the candidate as a genius? Or perhaps avoid them as someone who cannot write clear, maintainable code?
I suggest the best answer is
neither. Stay on track: you asked the question for the purpose of eliminating the 199 out of 200 that have absolutely no hope of ever producing working software. You have just identified that this person is the one in two hundred applicants who can actually code. Mission accomplished. You aren’t trying to get to HIRE or NO HIRE over the phone.
Invite them in for an interview: that’s the place where you can drill down and obtain their views on professional programming. Remember, this is a
toy problem. Unless you load the question down with requirements that their answer resemble production code, why should you have any expectation that the candidate only knows one way to write software?
This is like trying to hire programmers who know both languages, FORTRAN
and COBOL.
If you do ask them for production code, don’t be surprised if the best candidates decide not to work for you: they will rightly point out that FizzBuzz is not a realistic example of a production problem.
Overall I think the best approach as an interviewer is the simplest: pose the simple question. Indicate it is a screener designed to confirm that they are the person described on the resume.
If anything, tell them that there is no need for production artifacts like documentation and unit tests, you just want working code in as little time as they are able. The no-hopers won’t get it no matter how simple the question, and the good people won’t waste a lot of time with JavaDoc and xUnit.
And then make your decision based on one objective fact:
does the code run and produce the desired output. Everything else is superfluous to the question of whether the candidate should be asked into your office for a face-to-face meeting.
Update:
a response to some of the criticism.
Labels: jobs, lispy, ruby
Giles Bowkett turns language elitism upside down
To say that Lisp is a language of the gods, and Java is a language for mortals, that could in fact be getting everything entirely backwards. Scheme is actually easier to use than Java. So the easier language is the language of the gods, and the hard one is the one for regular people? How does that make sense?
Isn’t the idea that
languages like Scheme and SmallTalk are easier to use than languages like C# and Java very interesting? Is this true? How could this be? The conventional wisdom is that languages like Scheme are
too abstract. How could so-called novices grasp concepts like recursion, mapping, and folding when they have trouble remembering that the
main method must be
static?


The authors of The Little Schemer and The Little MLer bring deep and important insights to the Java language. Every serious Java programmer should own a copy.Could it be that more abstract languages are
smaller languages? With a more abstract language, there are obviously fewer things you need to use to write even complex functions. That’s the expressiveness people talk about: everyone grasps the idea that some languages let you express an algorithm in fewer bits of information. So with these languages, you need to use fewer bits of information to express each idea.
Could it be that these same languages require you to learn fewer bits of information as well?
I am reminded of one of those silly “Men are from Mars,
degrees are from mail-order stores” aphorisms. The idea was that every bit of kindness is worth one point, no more no less. One rose? One point. A dozen roses? The same one point.
We talk about some ideas (like mapping and folding) as if they are intrinsically harder to grasp and learn than other ideas (like iteration). But maybe it doesn’t require more bits to learn? Maybe map is one point and iteration is one point, even though we want to think that map is twelve points and iteration is two points?
If that’s the case, could a language with fewer ideas that can be combined in succinct ways be easier to learn than a language with lots of different weak ideas?
Labels: lispy
Lisp is not the last word
Ken Tilton asked:
What is up the power continuum from Lisp?I don’t have a ready answer. However, just because I don’t have an answer doesn’t mean I don’t believe there’s an answer. It could be that Lisp is a little like Democracy. It could be the least powerful programming language possible, excepting all of the others invented so far. But you know what? I have faith
we can do better.
Ken doesn’t say there isn’t a language up the power continuum. And I won’t say we have already invented one: like Ken, I’ll pose a question:
what law of computer science places a limit on the power continuum at Lisp?
G.J. Chaitin explains his proofs of Kurt Godel’s incompleteness theorem and Alan Turing’s “halting problem” in computation. Chaitin’s creative use of Lisp in mathematics and fervent belief that no theorem is proof against new analysis are welcome shots of espresso.Human history is chock-a-block full of inventions and practices that were considered for decades or even centuries to be the final word, the ultimate expression and implementation of ideas. And then someone came along and demolished everything. Geocentricity. Heliocentricity. Newtonian celestial mechanics. Light as a wave. Light as a particle. Three dimensions. Uniform space. Euclidian geometry.
Some of these new ideas took years to take root while the establishment derided them as “not even wrong.” Others were so obviously right they immediately displaced what had come before. We might now have invented a more powerful language. Or we might have invented one but not realize it yet. But who can say that we haven’t invented a more powerful language and will never do so?
If you believe there
is a power continuum, if you are not so obsessed with Turing Completeness and theoretical equivalence, what is the argument that it has any limit whatsoever, let alone that its limit is Lisp?
I believe that the only language that is affixed to the top of the power continuum is
Blub. For everyone whose imagination soars above the ceiling of their laboratory, Lisp is not the last word.
Labels: lispy, popular
The significance of the meta-circular interpreter
A
self-interpreter is a “programming language interpreter written in the language it interprets.” A
meta-circular interpreter is a special case of a self-interpreter that applies only to programs where the primary representation of the program is a primitive data type in the language itself (this property is called homoiconicity). Lisp is such a language because Lisp programs are lists of symbols and other lists. XSLT is such a language because XSLT programs are written in XML.
(If you have ever written an XSLT that transforms other XSLTs, then you immediately grasp the advantage of a meta-circular interpreter over an ‘ordinary’ self-interpreter: it is not just possible, but it is
easy to write programs that write programs, because you don't have to fiddle with transforming each program into an abstract data structure (typically a tree) that can be manipulated by your program.)
This is interesting to both language theoreticians and hobbyists. But does it matter to those of us trying to get things done? What is the significance of meta-circular interpreters and self-interpreters?
The short answer is that if you are working on meta-programming, a self-interpreter makes all things not just possible, but practical. The lack of such an interpreter places a limit on how much you can accomplish using your implementation language.
Let’s start our examination of the significance of self-interpreters with a review of meta-programming, or bottom-up programming. This is the practice of constructing a programming language tailored to your problem space. You get your language’s basics working by building it on top of a base, or
implementation language, and then you build your solution in your
solution language.

While meta-programming, you are working on two tiers either simultaneously or alternately: you work on expressing your solution in your solution language, and you work on the implementation (whether that be fun stuff like making it expressive or plumbing stuff like making it fast enough to be practical) in your implementation language.
An important and popular sub-domain of meta-programming is the practice of writing
Domain-Specific Languages or DSLs. DSLs are solution languages tailored to resemble as closely as possible the human jargon of “domain experts.”
(
Obie Fernandez has an excellent
presentation on the rationale for and construction of DSLs.)
DSLs are commonly cited as useful in areas where programmers who are not domain experts collaborate with domain experts who are not programmer. In my own experience, DSLs are also valuable even when the programmers are themselves the domain experts. My rule of thumb for whether a DSL would be worthy of consideration is to ask how two programmers discussing the solution to a problem over an IM client would talk. If their language would closely resemble the natural constructs and idioms of their chosen implementation language, there is no need for a DSL.
This is not always the case. A very classic example is that of SQL. SQL is a DSL designed for programmers to express relational algebra rather than the imperative steps for performing queries and updates. Although complex cases are impenetrable to the journeyman, its general form closely follows the way even a non-technical person would express their thoughts about data that is stored in tables.
Another example of a successful DSL for programmers is the regular expression engine present in almost every language (whether baked in or as a commonly available library). Programmers do not discuss searching for text patterns in terms of backtracking and lookup tables and loops unless they are implementing a search library themselves.
Programmers talk about
matching the string ‘Nokia’ followed by four digits, a forward slash, and two numbers separated by a period in the User-Agent Header. Regular expressions, while imperfect, match the way programmers think and talk about text matching much more closely than writing out the most efficient steps using
strstr and
for loops.
Meta-circular interpreters are nothing new. Lisp is most famous for its
meta-circular interpreter. But do you know that one of the world’s most popular programming languages has the next best thing, a self-interpreter? The C programming language’s compiler is written in C. That’s interesting. But why is it significant?
Well, instead of heading up the ‘power continuum’ and talking about Lisp, let’s stay with C for a moment. If you’re a C programmer and you become very, very interested in building a better programming language, you have in your hands the tool to make changes as you see fit.
You might, for example, start with a pre-processor and implement the first version of your C++ language by mechanically translating a C++ program to C. Or you might bootstrap your C++ language by writing a compiler for C++ in C. Because you have in your hands all of the tools for going from source to running program, you can enhance and change its behaviour exactly as you please.
When a language is not implemented in itself, you have limitations on your ability to create new forms. One might argue that Lisp’s macros make it possible to build any other language paradigm on top of Lisp. This is partially correct, however macros alone are not a complete answer. Macros act to rewrite local sections of programs.

You cannot—to my knowledge—take a dialect of Lisp that does not support tail recursion and use macros to execute tail calls in constant space without rewriting every function using your macro. A similar argument holds for using
macros to implement continuations. You must manually rewrite every function using your macros if you want to change the global behaviour of your program.
Manually
blank every
blank. This sounds an awful lot like something we can automate. Automatically transforming entire programs is the province of interpreters and compilers, isn’t it? If you wish to perform transformations that are global in scope, you want a custom interpreter. Of course, you can write one from scratch in your implementation language.
But… This sounds like we are headed towards the
Turing Tar Pit. Isn’t it much easier to use the interpreter that’s built in? Implementation languages that provide an interpreter or compiler for themselves provide an industrial-strength, debugged platform for the construction of solution languages.
This is the other half of the power of Lisp: if you want to change deeper fundamental language features like whether you have a Lisp-1 or a Lisp-n, or whether all evaluation is lazy, or…, or…, or… you can, because Lisp interprets Lisp and Lisp compiles Lisp.
It is this reason that languages like Ruby, which is implemented mainly in C, provide less maximum power than languages like
Smalltalk, which is
implemented mainly in… Smalltalk. For example, there is talk—at this time—that continuations will be
temporarily dropped from Ruby 1.9. If you have an application making heavy use of continuations, what is your upgrade path?
Of course, not everyone
thinks they need all of a more powerful meta-programming implementation. You will have to decide for yourself whether a language without a meta-circular evaluator or self-interpreter offers other benefits that outweigh this significant feature.
Serving the self-interpreter to your canine companion
As Simen pointed out in a
comment, Ruby does have a third-party project to interpret Ruby in Ruby called
Rubinus. Assuming it graduates from its current status as an experimental work-in-progress, how is this different from having a self-interpreter baked into the language?

When a language is built on top of a self-interpreter, the language designers are forced to
eat their own dog food.
Now it is not correct to say that Matz does not use Ruby. He does, and so he does eat his own dog food. And for the domains where he uses Ruby, he has optimized Ruby to be a useful tool. But since he
doesn’t use Ruby to build Ruby, he does not have the same incentive to tune Ruby for the purpose of building languages.
An obvious example is Ruby’s performance. It is perfectly fine for
building CRUD applications. However, it lacks really high performance when working with complex data structures in memory. This is the kind of thing that an interpreter or compiler has to do when parsing code or managing
cactus stacks.
Imagine what would have happened had Matz become impatient waiting for Ruby to interpret itself when he was first building the language? I suggest that the implementation and language would both be tweaked to bring performance up an order of magnitude or more. Programmers are notoriously impatient with slow tools.
The implementation is not the only thing that improves when a language designer eats her own dog food by baking a self-interpreter into the language. The design of the language itself changes. Larry Wall has said that “Languages differ not so much in what they make possible, but in what they make easy.” When a designer builds their new language in itself, the language invariably makes building languages easy.
So I suggest that the presence of a self-interpreter
baked right into the language, not bolted on as an after-thought forces a language to be useful for building solution languages.
(
This is not a broad criticism of Ruby, or a suggestion that Ruby is not an excellent tool for building a wide variety of useful solution languages. I’m just trying to point out the salient distinction between baking a self-interpreter into a language from the beginning and bolting one on the side.)
Of course
you are. So tell me, how does
Eclipse do all of its magic with
Java, a language lacking a self-interpreter?
The answer is
here. The Eclipse team based Eclipse on VisualAge for Smalltalk. They were more than familiar with the benefits of having a self-interpreter, so they wrote most of one in themselves, in Java. Self-interpreters are also the basis for building advanced tools.
Labels: lispy, popular
The first seven books I would buy if my shelves were bare
Lucas Carlson won a $100 gift certificate on
Amazon.com for his 2nd place entry at
Rails Day (in collaboration with John Butler). Congratulations, Lucas!
Lucas asked for suggestions on spending the money. I tried suggesting a new
iPod Nano loaded up with the SICP lectures in
video podcast form. But as you would expect for someone working in a music-related venture, he has plenty of toys already.
So… here are the first seven books I would buy if my shelves were bare (in no particular order):
- Structure and Interpretation of Computer Programs
. You can read it for free on line, but it’s even better as a physical book. One for the ages, it’s the kind of thing that ought to be bound in rich leather (if you go for that sort of thing) and kept in the library you build for your luxury castle.
- To Mock a Mockingbird
. It seems you can’t raise micro-capital these days without understanding fixed point combinators. Here’s the most enjoyable text on the subject of combinatory logic ever written. What other textbook features starlings, kestrels, and other songbirds?
- The Media Lab: Inventing the Future at M. I. T.
. Stewart Brand’s book captures the legendary think tank’s culture and ideas. Compare and contrast their view of broadcatch with today’s RSS feeds, or narrowcasting with today’s 500 channel television.
- On Intelligence
A book that shook my views about how my brain works. To pick one nugget out of many, neurons are so slow that in the time it takes for us to react suddenly—say to duck a flying object—there is only time for a chain of at most 100 steps to complete. 100 steps do not permit us to perform any complex reasoning or look-up. Jeff explains how the neocortex can accomplish complex tasks using layers of parallel switches.
- Philip and Alex’s Guide to Web Publishing. While the technologies suggested (TCL, AOLServer) are unlikely to float your boat, this is the most beautiful technical book on my shelves. Philip’s advice on how to build software for web publishing and approach is still relevant several generations of web developers later. (Also available on line for free.)
- The Recursive Universe: Cosmic Complexity and the Limits of Scientific Knowledge. Although it seems to be out of print, hunt down a copy for yourself. A thrilling journey into the ideas of the great John Horton Conway and computation’s building blocks. Best of all, it’s explained beautifully using the legendary Game of Life. Who knew that puffer trains and spaceships are Turing Complete?
- QED: The Strange Theory of Light and Matter
. Yes, the full Feynman Lectures are incomparable, and for further thrills you can listen to him give the lectures on audiobook. But in QED, Feynman does this one magical thing: he explains how a mirror reflects light. And in the process of explaining how light actually reflects off a mirror, Feynman deconstructs classical physics and rebuilds our understanding with Quantum Electrodynamics. For a moment, you can understand how little we really know about how the universe works.
Is there a book you would
recommend? What’re
your feelings about the books I’ve suggested?
p.s. Shane Sherman’s
The 5 Books that Every Programmer Should ReadLabels: lispy, popular
Irony
Programs must be written for people to read, and only incidentally for machines to execute.

From time to time people quote this as a justification for staying
away from high-level languages like Lisp, Scheme,
Python, and
Ruby, and not only sticking to popular languages but also staying within the lowest-common denominator when choosing idioms within those popular languages.
Isn’t it interesting that the quote is from a book explaining concepts such as recursion and metalinguistic abstraction? In Scheme?
The
full text is on line. If you have yet to read it, I recommend you give it a try.
update: if you enjoyed reading SICP, you may find my list of the first seven books I would buy if my shelves were bare interesting.Labels: lispy, popular
Are we Blub programmers?
From time to time I open my email and find someone asking a question:
What the hell is Blub? I take from context that it may be a pejorative generalist term for programming languages that encourage writing pablum instead of programs. Or maybe it really is a new language out there that has a huge following I'm unaware of.
Chalain, the "
if you're not having fun, you're doing it wrong" guy

Actually, Blub is a hypothetical programming language Paul Graham invented when describing something very interesting: the
Blub Paradox:
Blub falls right in the middle of the abstractness continuum... As long as our hypothetical Blub programmer is looking down the power continuum, he knows he's looking down. Languages less powerful than Blub are obviously less powerful, because they're missing some feature he's used to. But when our hypothetical Blub programmer looks in the other direction, up the power continuum, he doesn't realize he's looking up. What he sees are merely weird languages... Blub is good enough for him, because he thinks in Blub.
The interesting things about this paradox is that almost
any language could be Blub. The pre-requisites for a language being Blub are (a) there is at least one language less powerful than Blub, (b) there is at least one language
more powerful than Blub, and (c) that there be at least one programmer using Blub who accepts (a) but refutes (b) because he or she cannot see how the more powerful language is more powerful. She does not think in the
idioms that the more powerful language affords.

The Little MLer introduces ML (and Ocaml) through a series of entertaining and straightforward exercises leading up to the construction of the Y Combinator.
ML and OCaml introduce powerful strong typing and type inference. Both are great languages to learn: you will stretch your understanding of defining types and writing correct programs.
When
I use the term, I am thinking of the language and also the programmers around it. Could Java be Blub? Sometimes, possibly often, but only when I'm thinking about Java programmers who dismiss Ruby's features as unnecessary.
Could Ruby be Blub? Sometimes, but only when I'm thinking about Ruby programmers who dismiss macros as unimportant.
Could Lisp be Blub? I suspect that Erlang and Haskell programmers might say that it is, provided we can find a Lisp programmer who feels that all progress in programming languages stopped when Common Lisp was standardized.
At the same time, Java is not Blub when I am thinking of programmers
who are perfectly aware of its shortcomings and deliberately Greenspun around them for pragmatic reasons. The same goes for any other language: it is sometimes Blub, and sometimes not Blub.
So... I use the term "Blub" to refer to a programming language in the context of intransigent programmers who feel that their chosen tool is the best tool possible.
Programming consists of overcoming two things: accidental difficulties, things which are difficult because you happen to be using inadequate programming tools, and things which are actually difficult, which no programming tool or language is going to solve.
This provokes a very obvious question:
How do we know which things are accidentally difficult and which are actually difficult? Is it only because we haven't discovered the right tool yet?
It's easy to find a Java programmer who believes that all of the Design Patterns in the GoF's book are necessary. She believes that the difficulties of applying those patterns are actual difficulties of programming systems. It is only when she learns a different language that she realizes how the patterns were strongly driven by limitations in Java's object model.
At that point she has an epiphany and understands that what she thought were actual difficulties were merely accidental difficulties. And the line between "accidental" and "actual" moves for her.
No matter how much each us us thinks we know right now, are we nevertheless like this Java programmer, unable to see the difference between accidental and actual differences because we simply haven't discovered a more powerful tool?
Are we Blub programmers?
update: In praise of informed choices and Lisp is not the last wordLabels: java, lispy, popular
Dynamic is the opposite of Static, not of Explicit
I recently read a post where the author said he does not care for Ruby's dynamic typing.
When people start talking about types in programming languages, the terms fly around in a rather fast and loose manner. Here's a rather extensive and balanced discussion: the article on
Type Systems in Wikipedia.
Notice that Ruby is
dynamically typed (it's types are determined at run time and can change at run time),
strongly typed (it does not allow an operation to succeed on arguments which have the wrong type), and
type-safe (it does not allow operations or conversions which lead to erronous conditions).

(There's another school of thought about how to name the properties of type systems. More on that at the bottom.)
Proponents of
explicit types may say that Ruby is not type safe because it is possible to compile programs that contain errors which will not be detected until the program is run and the erroneous condition is detected.
Statically typed languages such as C# can detect a class of such errors at compile time. Some Ruby enthusiasts argue that they do not like the boilerplate associated with explicit typing and feel that the extra error checking does not outweigh the additional overhead.
Both arguments are mistaken. Sorry about that. If Explicit and Implicit were the issues, we could add
Type Inference to Ruby and it would have the low overhead of implicit types as well as some extra error-checking (work on Soft Typing for languages like Scheme and Erlang aims to provide compile-time checking for implicitly typed languages). However, type inference is a feature of
statically typed languages.
The trouble is that Ruby is
dynamically typed. Specifically, Ruby contains extensive
dynamic meta-programming constructs. Type inference works with statically typed languages. The compiler must be able to infer the type or set of types possible for any variable.
Consider a Ruby application that modifies classes and objects at run time. The simplest example is Ruby's built-in accessor methods:
attr_accessor :foo is a class method that actually creates two instance methods at run time,
foo and
foo=. What happens when attr_accessor is called with a
variable as its argument, like
attr_accessor my_attribute?
If the type inference engine later looks at a call such as
bar.blitz = 'bash', how does it know whether attribute_accessor was ever called with 'blitz' or :blitz as an argument? Dynamic meta-programming makes the type inference problem undecidable.
A lucid argument is that Ruby's dynamic typing makes it difficult to detect type errors statically. The
correct counter-argument is that Ruby's style of dynamic typing makes it possible to use dynamic meta-programming, such as
Ruby on Rails' ActiveRecord, not that Ruby's implicit typing is more productive.
Once you've made these arguments, you can decide for yourself whether the benefits of dynamic meta-programming do or do not outweigh the advantages of static type checking.
(Updated to use the expression "dynamic meta-programming" to distinguish features like
define_method from static meta-programming features like macro systems, courtesy of the well-informed
Lambda the Ultimate).
fin
p.s. Ward's Wiki seems to use the term static typing to mean the same thing as explicit and strong typing. I prefer that the term static means that it doesn't change. The same wiki seems to use the term dynamic typing to mean the same thing as implicit and strong typing. It includes the possibility that a program is what I would call strongly, statically and implicitly typed: the checking is done at run time but the types of variables never change. The commentary suggests the phrase semantic dynamic typing for the quality of dynamic typing that encompasses dynamic meta-programming. If you like these terms, please substitute "semantic dynamic typing" for "dynamic typing" above.p.p.s. And there's another argument that dynamic meta-programming, like macros, should be considered harmful. I'm saving my reply for a rainy day.Labels: lispy, ruby
(eq? 'lisp 0)
I just
read (via Jim Loy's
math pages) that it took Europe approximately
five centuries to adopt
zero and the Hindu-Arabic numbering system. ("from about
twelve hundred when Fibonacci introduced Hindu-Arabic numbers to Europe, to about
seventeen hundred when they became popular").
So...
Lisp has only been around since
nineteen fifty-nine or so,
not even five decades. We may have to wait another
four hundred and fifty-four years before every programming language embraces program/data equivalence, bottom-up programming, functional programming, macros, and even parentheses.
As Jim points out, many people exposed to the new system may not have seen the advantages. Their existing zero-less numbering systems suited their existing needs just fine. This argument struck me as familiar.
Plus ça change... (plus c'est la même chose).
Labels: lispy
Repost: Closures in Ruby
(I wrote the original version of this page in 2002. I've made a few minor edits and added a comparison with Java's anonymous inner classes)
I briefly worked with a team that used Perl to implement high availability web applications. When discussing the language with the team’s technical lead, I pointed out that I was impressed with the fact that Perl implemented closures. Having written a Scheme interpreter, I considered closures a fundamental component of modelling procedures.
This led to a discussion of what was a closure, and what was it good for?
A closure encapsulates the execution of a one or more operations for side effects and/or the return of a value in the environment of the function’s definition where the closure was created.
From this definition, all functions, procedures, and methods in languages such as Java and Visual Basic are closures. When a programmer refers to a language as implementing closures, (s)he is really saying that the language permits the creation of arbitrary closures at run time. Scheme aficionados would say that languages like Perl, Lisp, and Ruby support
first class closures: closures can be arbitrarily created and assigned as values to variables or returned from functions.Since contemporary programming languages are lexically scoped, the environment of the function refers to the variables in scope at the time the function is defined. This includes temporary variables, variables that are normally created on some sort of stack and discarded when they “go out of scope.” When a closure is created, variables in scope must be preserved until the closure itself ceases to exist.
Here’s a Ruby closure demonstrating the fact that it ‘captures’ a variable in the scope of its definition:
def makeCounter
var = 0
lambda do
var +=1
end
end
c1 = makeCounter
c1.call
c1.call
c1.call
c2 = makeCounter
puts "c1 = #{c1.call}, c2 = #{c2.call}"
The two important things from this example are:
Although var is no longer in scope once makeCounter returns, Ruby saves it for use in the closure.
Each invocation of makeCounter creates a different var. The two counters do not interfere with each other.
What can you do with closures? Here’s something a bit more useful, a call-by-need thunk factory:
def delay(&procToDelay)
value = nil
return lambda do
if value.nil?
value = procToDelay.call()
else
value
end
end
end
def force(thunk)
thunk.call()
end
foo = delay do
puts "thinking about foo"
"fu"
end
bar = delay do
puts "thinking about bar"
"british american racing"
end
puts force foo
puts force bar
puts force foo
puts force bar
In this example, you have a simple facility for memoizing closures: they can be called repeatedly, but they only evaluate their operations once (provided the retun value is not nil). Obviously, this should not be combined with the previous example: call-by-need thunks are useful when there are no side effects of their evaluation.
Why Java's Anonymous Inner Classes do not implement closures
At first glance, an anonymous inner class in Java looks like it captures an environment. It has access to its enclosing instance's members. That looks an awful lot like the way a closure captures its environment.
But an anonymous inner class cannot access method variables or parameters. This is a crippling limitation. Consider:
interface Transformer {
int transform (int what);
}
class TransformerConstructionKit {
public static Transformer makeMultiplier (int timesWhat) {
return new Transformer () {
public int transform (int what) {
return what * timesWhat;
}
};
}
}
This is illegal in Java for some reason. Ok, I know what the reason is. But I don't have to like it, do I?
Labels: lispy, ruby
Page Thirteen
In
a recent interview posted on ACM Queue, Alan Kay described a starting revelation:
Stuart Feldman If nothing else, Lisp was carefully defined in terms of Lisp.
Alan Kay Yes, that was the big revelation to me when I was in graduate school when I finally understood that the half page of code on the bottom of page 13 of the Lisp 1.5 manual was Lisp in itself. These were Maxwells Equations of Software! This is the whole world of programming in a few lines that I can put my hand over.
Here's the page in question:

Defining Lisp in terms of Lisp is something of a tradition. One of the best books I've ever read about programming is
Lisp in Small Pieces by Christian Queinnec:

Christian explains fundamental elements of computer science by implementing each element in a series of interpreters and compilers.
I'm tempted to try to sum this post up with a heavy conclusion, to try to derive some deep meaning from this. But you know, if you see the world the way I do, the meaning is obvious, I'd just be preaching to the choir.
I'll take the pretentious way out and close with an ancient quote:
Do not walk in the footsteps of the sages. Instead, seek what they sought.
Labels: lispy