raganwald
Friday, November 09, 2007
  Really useful anamorphisms in Ruby
Really simple anamorphisms in Ruby introduced a very simple unfold. Its chief characteristics were that it generated an Array from a value of some sort, and it did so by applying an incrementor block to its seed recursively until it generated nil. For example:

10.class.unfold(&:superclass)
=> [Fixnum, Integer, Numeric, Object]
A very simple modification allows us to separate the two blocks with a :while or :to pseudo-keyword, and to add a :map keyword for transforming the state into the desired result. Thus, this really simple unfold:

1.unfold(&'_+1 unless _==10').map(&'**2')
=> [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
Can also be expressed as:

1.unfold(:to => '==10', :map => '**2', &'_+1')
=> [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
The latter form is helpful once unfolds become larger and more complex than these simple one-liners.
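Here is a minimal sketch of how such a keyword-driven unfold could be put together. It is an illustration under stated assumptions, not the code behind the examples above: it is a free-standing function rather than a method on Object, it takes lambdas instead of strings so that it does not need String#to_proc, the keyword is spelled while_ because while is reserved, and it assumes :to is inclusive (the state that satisfies it is still emitted).

```ruby
# Hypothetical stand-alone unfold, written for illustration only.
#   while_: keep emitting states for as long as this predicate holds
#   to:     emit the state that satisfies this predicate, then stop
#   map:    turn each state into an output value
def unfold(seed, while_: nil, to: nil, map: ->(state) { state }, &step)
  result = []
  state = seed
  loop do
    break if state.nil?                        # the really simple stopping rule
    break if while_ && !while_.call(state)     # guard checked before emitting
    result << map.call(state)
    break if to && to.call(state)              # inclusive upper bound
    state = step.call(state)
  end
  result
end

squares = unfold(1, to: ->(n) { n == 10 }, map: ->(n) { n**2 }) { |n| n + 1 }
# squares == [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
```

With no keywords at all, the same function behaves like the really simple unfold: it stops when the block returns nil.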

(There is another style of writing :unfold, using method chaining and lazy evaluation to eliminate lambda keywords, but we will save that for another time: it is a great examination of syntax but does not change :unfold’s fundamental behaviour.)

Let’s turn it up a notch

These trivial examples are not particularly compelling. Unfold is touted as the complement to :inject. So you would expect :unfold to be as useful as :inject. And :inject is very, very useful—you “reduce” lists of things to values all the time.

But how often do you need to turn a value into a list? How often do you need to turn ‘10’ into ‘[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]’? And if you do, what’s wrong with using (1..10).map(&'**2')?

Remember that :unfold can be applied to objects with a lot more information to them. The thing that had me stuck when I first saw :unfold was thinking of it as the opposite of :inject. Or at least, the opposite of how I used :inject. I tended to use :inject in a way that reduced information. For example:

[7, 6, 10, 3, 9, 4, 8, 5, 2, 1].inject(&'+')
This gives us the sum of the numbers from one to ten, as it happens. It also gives us a value that is considerably simpler than the list we used to generate the number. Information is lost when we use :inject to “reduce” a list to a very simple value. So my first reaction to :unfold was to think of ways to use :unfold on very simple values, like numerics.

But :unfold doesn’t have to work with simple values. It can work with arbitrarily complex data structures. Consider:

def zip(*lists)
  lists.unfold(
    :while => '.first',
    :map => '.map(&".first")',
    &'_.reject(&".length < 2").map(&"[1..-1]")')
end
Zip is a function that takes two (or more, but let’s just say two for now) lists, and produces a list of pairs of items. So:

zip([:a, :b, :c], [1, 2, 3])
=> [[:a, 1], [:b, 2], [:c, 3]]
How does :unfold do it? First, of course, it makes a single list of lists. It then performs an unfold on this single data structure. The incrementor successively reduces each sublist by removing the first items. So the output of the successive incrementor operations is:

[
[[:a, :b, :c], [1, 2, 3]],
[[:b, :c], [2, 3]],
[[:c], [3]]
]
The :map then extracts the first items from each sublist and presents them as a list:

[
[:a, 1],
[:b, 2],
[:c, 3]
]
Neat. But why do we care about zip? Well, if you’ll notice, we already have a bunch of really useful things we can do with lists, like :map, :select, :reject, :detect, and so on. What would you do if you had two lists and needed to do something with each pair in the list, like… A list of first names and surnames that need to be catenated together?

zip(first_names, surnames).map(&'"#{_[0]} #{_[1]}"')
Zip is useful when we have a bunch of parallel lists and there’s something we want to do with each tuple from the lists.
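The zip walk-through can be replayed in plain Ruby. This is only a sketch: the unfold helper and the name zip_by_unfold are hypothetical, lambdas and blocks stand in for the string lambdas, and the names at the bottom are made-up data.

```ruby
# Hypothetical helper: emit map.(state) for as long as keep_going.(state) is truthy.
def unfold(seed, keep_going:, map:)
  result = []
  state = seed
  while keep_going.call(state)
    result << map.call(state)
    state = yield(state)          # the incrementor block produces the next state
  end
  result
end

def zip_by_unfold(*lists)
  unfold(lists,
         keep_going: ->(ls) { ls.first },            # stop once no sublists remain
         map: ->(ls) { ls.map(&:first) }) do |ls|    # the heads form one tuple
    # Drop exhausted sublists, then chop the head off each remaining one.
    ls.reject { |l| l.length < 2 }.map { |l| l[1..-1] }
  end
end

pairs = zip_by_unfold([:a, :b, :c], [1, 2, 3])
# pairs == [[:a, 1], [:b, 2], [:c, 3]]

first_names = %w[Alan Grace]
surnames = %w[Kay Hopper]
full_names = zip_by_unfold(first_names, surnames).map { |first, last| "#{first} #{last}" }
# full_names == ["Alan Kay", "Grace Hopper"]
```

For lists of equal length this agrees with Ruby’s built-in Array#zip.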

Generalized iteration

We recognize this “pattern,” it’s one of the most powerful in programming. Zip was one algorithm, a way of iterating over several lists simultaneously. The other algorithm was "#{_[0]} #{_[1]}", a recipe for what to do with the successive tuples of values.





The powerful idea was to separate the mechanics of turning a data structure into a linear series of values—iterating—from what we want to actually do with each value. (In OO-style programming, we would define a method for lists of lists that returns an iterator over the tuples of values. Same thing, proving that how you do it is not as important as understanding why you do it.)

Unfold has other uses, but this one alone is worth the trouble to understand the pattern even if you aren’t rushing to implement this exact unfold method: Converting a single data structure to a list is one way to implement iteration: for any data structure, you can use unfold to define a linear iteration. You can then use :each or :map or :inject just as our parents before us would have used DO or FOR loops.

Consider this (inelegant, but I’m writing this rather late at night) unfold:

[[1, 2, 3], [4, 5, 6, [7, 8]], 9, 10].unfold(
  :while => '.first',
  :map => lambda { |first|
    first = first.first while first.kind_of?(Array)
    first
  }
) { |state|
  state = state.first + state[1..-1] while state.first.kind_of?(Array)
  state[1..-1]
}
=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
This is the same idea, only we convert a tree into a list representing a depth-first search of a simple tree. You may recognize it as Array’s :flatten method. Once again, it’s really a way of iterating over the elements of a tree. So one way to think of this :unfold is that it is an iterator over a tree’s leaves:

def flatten(arr)
  arr.unfold(
    :while => '.first',
    :map => lambda { |first|
      first = first.first while first.kind_of?(Array)
      first
    }
  ) { |state|
    state = state.first + state[1..-1] while state.first.kind_of?(Array)
    state[1..-1]
  }
end
But I already know how to write Zip and Flatten methods, honest I do.

Zip and Flatten are relatively common, that’s why :flatten and :zip can both be found in Ruby’s standard Array class. And if there’s a data structure that needs regular unfolding, you ought to weigh the advantages and disadvantages of writing an :unfold for it or using more humdrum ways of writing an iterator.





However, what do you do when you only need to unfold something once? For example, perhaps you have code that obtains some data in JSON format, and having used a library to parse the JSON into a one-off list or hash, you want to iterate through it.

With unfold, you can write your one-time, specific iterator right in place. This is no different than using blocks and lambdas in Ruby for one-off functions that really don’t need the ceremony and weight of being implemented as methods.
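As a sketch of what such a one-off iterator might look like, here is a really simple unfold applied to some parsed JSON. The JSON document, its reply_to key, and the free-standing unfold function are all invented for this illustration.

```ruby
require 'json'

# Really simple unfold as a plain function: keep applying the block
# to the state until it returns nil.
def unfold(seed)
  result = []
  state = seed
  until state.nil?
    result << state
    state = yield(state)
  end
  result
end

# Made-up data: each comment points at the comment it replies to.
doc = JSON.parse('{"id": 3, "reply_to": {"id": 2, "reply_to": {"id": 1, "reply_to": null}}}')

# The one-off iterator, written in place: follow reply_to until it runs out.
thread = unfold(doc) { |comment| comment["reply_to"] }
ids = thread.map { |comment| comment["id"] }
# ids == [3, 2, 1]
```

The block says how to walk the structure; the :map afterwards says what to do with each element, and neither knows about the other.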

When you want to iterate through something, and you want to separate the mechanism for iterating through the data from what you do with the data, :unfold should be in your tool box.

Unfold and the bio-sciences. Not really.

I like to think of :unfold like unfolding a protein molecule. When you stare at a data structure, it’s dense, opaque. But you supply an unfold algorithm, and what looked like a messy ball of twine unravels into a long filament made up of simple elements. You can then operate on the simple elements, without getting what you want to do en-snarled in how you iterate over the data structure.

So there you have it. Unfold can be really useful if we see it as a standardized way to write iterators for data structures.


Update: The under-appreciated unfold.


 

Wednesday, November 07, 2007
  Really simple anamorphisms in Ruby
Anamorphisms are functions that map from some object to a more complex structure containing the type of the object. They are the dual—a fancy word for complement—of Catamorphisms, functions that map from some complex structure down to a simpler object.

In simpler terms, a Catamorphism is a function like inject: In Ruby, Inject takes a collection and produces something simpler. It is also called fold or reduce in other languages. Inject can do something like produce the sum of a list:

(1..5).inject &'+' => 15

Unfold does the reverse: it takes a single value and turns it into a collection. Now, a proper unfold can be configured with a seed value, a transformation, a stopping predicate, maybe a distinction between states and output values, and even the type of structure you want to create. A proper unfold would even work with lazy lists if it didn’t have a stopping condition.
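To make the lazy case concrete, here is a minimal sketch built on Ruby’s Enumerator. The name lazy_unfold is hypothetical, and the sketch has no stopping condition at all: the caller decides how much of the sequence to consume.

```ruby
# Hypothetical lazy unfold: yields the seed, then each successive state, forever.
def lazy_unfold(seed, &step)
  Enumerator.new do |yielder|
    state = seed
    loop do
      yielder << state
      state = step.call(state)
    end
  end
end

powers_of_two = lazy_unfold(1) { |n| n * 2 }

first_five = powers_of_two.take(5)
# first_five == [1, 2, 4, 8, 16]

# .lazy lets us chain further operations without running off to infinity.
big_ones = powers_of_two.lazy.select { |n| n > 100 }.first(2)
# big_ones == [128, 256]
```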

But sometimes you want something really simple. The state and the output value could be the same thing, eliminating a transformation from state to output. The stopping predicate could be simple: it could stop when it reaches nil. And it could always return Arrays. So you could use such a simple unfold whenever you have a seed value and some sort of function (expressed as a block) that returns nil when it has no more values.

For example:

10.unfold { |n| n-1 unless n == 1 }.inspect => [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
10.class.unfold(&:superclass).inspect => [Fixnum, Integer, Numeric, Object]

Hey, what happens if you combine really simple anamorphisms with really simple catamorphisms?

5.unfold(&'_-1 unless _==1').inject(&'*') => 120

Here is the code unfold.rb. (If overloading core classes with too much responsibility is not to your taste, it is trivial to express unfold as a function taking a seed and a block as arguments).
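For instance, the function form might look something like this recursive sketch. It is an assumption about the shape of such a function, not a copy of unfold.rb.

```ruby
# A really simple unfold as a plain function: the seed, followed by the
# unfold of whatever the block makes of it, until the block returns nil.
def unfold(seed, &incrementor)
  return [] if seed.nil?
  [seed] + unfold(incrementor.call(seed), &incrementor)
end

countdown = unfold(10) { |n| n - 1 unless n == 1 }
# countdown == [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]

factorial_of_five = unfold(5) { |n| n - 1 unless n == 1 }.inject { |product, n| product * n }
# factorial_of_five == 120

ancestry = unfold(10.class, &:superclass)
# The exact chain depends on your Ruby version.
```

Being recursive, it uses one stack frame per element, so very long unfolds will exhaust the stack.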

Now that I’ve whetted your appetite, here’re Really useful anamorphisms in Ruby. And for another implementation, everything you ever wanted to know about implementing a true unfold in Ruby.

Enjoy!

(Code courtesy of Mobile Commons. Thanks!)


Here’s an interesting email from Hugh Sasse:

I was reading Code Complete 2 last night and Steve McConnell says that he wouldn’t hire a programmer who wrote a recursive factorial function. [I’m not sure that would be such a crime in languages like Lua with proper tail recursion, but still. (I’m probably wrong, now I’ve written that! “Open mouth, insert foot, echo internationally” as they used to say on Fidonet :-))] I thought “that seems a bit harsh”, especially since in real life they’d get one from a library most of the time, anyway.

So, as I have often run out of stack on Ruby, I’ve tried to rewrite unfold as an iterative function. It seems to be working. The script as modified blows up at 5000 on my system with the recursive version, but the iterative version succeeds at 5000.

Thanks for this blog entry. There’s something REALLY nice about this idea, which I can’t put my finger on. Might be something to do with loops terminating when they should, and The Pragmatic Programmers’ article about “cook until done”

http://www.pragprog.com/articles/cook-until-done

Unfold feels like a “do until finished” loop.

class Object
  # As above, but iterative, rather than recursive.
  def unfold2 &block
    result = [self]
    x = block.call(self)
    while not x.nil?
      result.push x
      x = block.call(x)
    end
    return result
  end
end


 

Saturday, October 27, 2007
  String#to_proc
Breaking news! The irb enhancement gem Utility Belt includes String#to_proc

String#to_proc is an addition to Ruby’s core String class to enable point-free hylomorphisms

I’ll start again. String#to_proc adds a method to Ruby’s core String class to make lots of mapping and reducing operations more compact and easier to read by removing boilerplate and focusing on what is to be done. In many cases, the existing block syntax is just fine. But in a few cases, String#to_proc can make an expression even simpler.

String#to_proc is a port of the String Lambdas from Oliver Steele’s Functional Javascript library. I have modified the syntax to reflect how String#to_proc works in Ruby.

We’ll start with the examples from String Lambdas so you can see what is actually going on. Then we’ll look at how to use the & coercion to make working with arrays really simple.

to_proc creates a function from a string that contains a single expression. This function can then be applied to an argument list, either immediately:

'x+1'.to_proc[2];
→ 3
'x+2*y'.to_proc[2, 3];
→ 8
or (more usefully) later:

square = 'x*x'.to_proc;
square[3];
→ 9
square[4];
→ 16
Explicit parameters

If the string contains a ->, this separates the parameters from the body.

'x y -> x+2*y'.to_proc[2, 3];
→ 8
'y x -> x+2*y'.to_proc[2, 3];
→ 7
Otherwise, if the string contains a _, it’s a unary function and _ is the name of the parameter:

'_+1'.to_proc[2];
→ 3
'_*_'.to_proc[3];
→ 9
Implicit parameters

If the string doesn’t specify explicit parameters, they are implicit.

If the string starts with an operator or relation besides -, or ends with an operator or relation, then its implicit arguments are placed at the beginning and/or end:

'*2'.to_proc[2];
→ 4
'/2'.to_proc[4];
→ 2
'2/'.to_proc[4.0];
→ 0.5
'/'.to_proc[2.0, 4];
→ 0.5
’.’ counts as a right operator:

'.abs'.to_proc[-1];
→ 1
Otherwise, the variables in the string, in order of occurrence, are its parameters.

'x+1'.to_proc[2];
→ 3
'x*x'.to_proc[3];
→ 9
'x + 2*y'.to_proc[1, 2];
→ 5
'y + 2*x'.to_proc[1, 2];
→ 5
Chaining

Chain -> to create curried functions.

'x y -> x+y'.to_proc[2, 3];
→ 5
'x -> y -> x+y'.to_proc[2][3];
→ 5
plus_two = 'x -> y -> x+y'.to_proc[2];
plus_two[3]
→ 5
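For comparison, here is a sketch of the same currying using nothing but an ordinary lambda and Proc#curry from core Ruby (1.9 and later). It involves no string lambdas; the variable names simply mirror the example above.

```ruby
add = lambda { |x, y| x + y }

uncurried = add[2, 3]        # both arguments at once, like 'x y -> x+y'

curried = add.curry          # one argument at a time, like 'x -> y -> x+y'
chained = curried[2][3]

plus_two = curried[2]        # partially applied: waiting for y
result = plus_two[3]
# uncurried, chained, and result are all 5
```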
Using String#to_proc in Idiomatic Ruby

Ruby on Rails popularized Symbol#to_proc, so much so that it will be part of Ruby 1.9.

If you like:

%w[dsf fgdg fg].map(&:capitalize)
→ ["Dsf", "Fgdg", "Fg"]
then %w[dsf fgdg fg].map(&'.capitalize') isn’t much of an improvement.

But what about doubling every value in a list:

(1..5).map &'*2'
→ [2, 4, 6, 8, 10]
Or folding a list:

(1..5).inject &'+'
→ 15
Or having fun with factorial:

factorial = "(1.._).inject &'*'".to_proc
factorial[5]
→ 120
String#to_proc, in combination with & coercing a value into a proc, lets you write compact maps, injections, selections, detections (and many others!) when you only need a simple expression.

Caveats: String#to_proc uses eval. Cue the chorus of people—pounding away on quad 3GHz systems—complaining about the performance. You’re an adult. Decide for yourself whether this is an issue. After mankying things about to deduce the parameters, String#to_proc evaluates its expression in a different binding than where you wrote the String. This matters if you include free variables. My thinking is that it ceases to be a simple, easy-to-understand hack and becomes a cryptic nightmare once you get too fancy.
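The binding caveat is easy to see with a toy stand-in. The string_lambda function below is hypothetical and far cruder than String#to_proc (it only understands _), but it has the same property: the string is evaluated somewhere other than where you wrote it.

```ruby
# Hypothetical, deliberately crude stand-in for a string lambda.
# The eval runs inside this method, so it sees this method's binding,
# not the binding of whoever wrote the string.
def string_lambda(body)
  eval("lambda { |_| #{body} }")
end

factor = 3

with_block = [1, 2].map { |x| x * factor }   # a block closes over factor
# with_block == [3, 6]

begin
  [1, 2].map(&string_lambda('_ * factor'))   # factor is a free variable here
  outcome = :worked
rescue NameError
  outcome = :name_error                      # factor is not visible inside string_lambda
end
# outcome == :name_error
```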

You know that Voight-Kampff test of yours… did you ever take that test yourself?
—Rachael, Blade Runner


I have been using Functional Javascript for quite some time now, and I use the String Lambdas a lot. However, Ruby and Javascript are very different languages. Once you get out of the browser’s DOM, Javascript is a lot cleaner and more elegant than Ruby. For example, you don’t need to memorize the difference between a block, a lambda, and a proc. Javascript just has functions.

However, Javascript is more verbose: Whereas in Ruby you can write [1, 2, 3].map { |x| x*2 }, if Javascript had a map method for arrays, you would still have to write [1, 2, 3].map(function (x) { return x*2; }). So it’s a big win to make Javascript less verbose: code is easier to read at a glance when you don’t have to wade through jillions of function keywords.

Nevertheless, I still find myself itching for the String Lambdas when I’m writing Ruby code. It may be a matter of questionable taste, but for certain extremely simple expressions, I vastly prefer the point-free style. (-3..3).map &:abs is shorter than (-3..3).map { |x| x.abs }.

It is also cleaner to me. abs is a message, especially in a language like Ruby that supports sending arbitrary messages named by symbols. Writing (-3..3).map &:abs looks very much like sending the abs message to everything in the list. I don’t need an x in there to tell me that.

Thus, I obviously like (-3..3).map &'.abs'. But I like (1..5).map &'*2' for the same reason. It isn’t just shorter, it hides a temporary variable that really doesn’t mean Jack to me when I’m reading the code. And quite honestly, (1..10).inject { |acc, mem| acc + mem } raises more questions than it answers about what inject does and how it does it. (1..10).inject &'+' gets right down to business for me. I’d prefer that it be called “fold,” but the raw, naked + seems to describe what I want done instead of how I want the computer to do it.

String#to_proc also supports named parameters, either through implication (&'x+y') or with the arrow ('x y -> x*y'). I haven’t thought of a case where that would be a win over using a Ruby block: { |x, y| x*y }.

I’m divided about the underscore notation. It seems like a good compromise for expressions where there is a single parameter and it doesn’t fall on the left or the right side of an expression. Standardizing on an unusual variable name is, I think, a win. Underscore often means a “hole” in an expression or a computation, so it feels like a good fit. I would honestly much rather see something like: &'(1/_)+1' than &'(1/x)+1'. The underscore jumps out in an obvious way, and it wouldn’t be magically clearer to write { |x| (1/x)+1 }.

That being said, I haven’t actually written an underscore expression yet in actual code, so far I’m getting by using the point-free expressions to simplify things and using Ruby blocks for everything else.

RSpec

describe "String to Proc" do

  before(:all) do
    @one2five = 1..5
  end

  it "should handle simple arrow notation" do
    @one2five.map(&'x -> x + 1').should eql(@one2five.map { |x| x + 1 })
    @one2five.map(&'x -> x*x').should eql(@one2five.map { |x| x*x })
    @one2five.inject(&'x y -> x*y').should eql(@one2five.inject { |x,y| x*y })
    'x y -> x**y'.to_proc()[2,3].should eql(lambda { |x,y| x**y }[2,3])
    'y x -> x**y'.to_proc()[2,3].should eql(lambda { |y,x| x**y }[2,3])
  end

  it "should handle chained arrows" do
    'x -> y -> x**y'.to_proc()[2][3].should eql(lambda { |x| lambda { |y| x**y } }[2][3])
    'x -> y z -> y**(z-x)'.to_proc()[1][2,3].should eql(lambda { |x| lambda { |y,z| y**(z-x) } }[1][2,3])
  end

  it "should handle the default parameter" do
    @one2five.map(&'2**_/2').should eql(@one2five.map { |x| 2**x/2 })
    @one2five.select(&'_%2==0').should eql(@one2five.select { |x| x%2==0 })
  end

  it "should handle point-free notation" do
    @one2five.inject(&'*').should eql(@one2five.inject { |mem, var| mem * var })
    @one2five.select(&'>2').should eql(@one2five.select { |x| x>2 })
    @one2five.select(&'2<').should eql(@one2five.select { |x| 2<x })
    @one2five.map(&'2*').should eql(@one2five.map { |x| 2*x })
    (-3..3).map(&'.abs').should eql((-3..3).map { |x| x.abs })
  end

  it "should handle implied parameters as best it can" do
    @one2five.inject(&'x*y').should eql(@one2five.inject(&'*'))
    'x**y'.to_proc()[2,3].should eql(8)
    'y**x'.to_proc()[2,3].should eql(8)
  end

end
Go ahead, download the source code for yourself.

Update: Reg smacks himself in the head!

I had a look at the source code for Symbol#to_proc:

class Symbol
  # Turns the symbol into a simple proc, which is especially useful for enumerations. Examples:
  #
  #   # The same as people.collect { |p| p.name }
  #   people.collect(&:name)
  #
  #   # The same as people.select { |p| p.manager? }.collect { |p| p.salary }
  #   people.select(&:manager?).collect(&:salary)
  def to_proc
    Proc.new { |*args| args.shift.__send__(self, *args) }
  end
end
Look at that: Although the examples are all of unary messages like .name, the lambdas created handle methods with arguments. And since almost everything in Ruby is a method, including operators like +… You can use Symbol#to_proc to do some of the point-free stuff I like:

[1, 2, 3, 4, 5].inject(&:+)
→ 15
[{ :foo => 1 }, { :bar => 2 }, { :blitz => 3 }].inject &:merge
→ {:foo=>1, :bar=>2, :blitz=>3}


 

Thursday, October 25, 2007
  Too much of a good thing: not all functions should be object methods
OOP is several different ideas put together, the most important of which is Fine-Grained Information Hiding.

One can think of information hiding as being the principle and encapsulation being the technique. A software module hides information by encapsulating the information into a module or other construct which presents an interface.
Information Hiding on Wikipedia


The basic principle of all OO languages is that relatively small things—such as individual accounts in a business program—each encapsulate both their data (in the form of members) and their algorithms (in the form of methods). Our notions of members and polymorphism both work to this goal of hiding information. There’s a lot more to most OO languages, such as whether they include a notion of types and what mechanisms they use for sharing common behaviour. But let’s look at this one principle: objects are responsible for their data and for their algorithms.

should objects be responsible for all of their own behaviour?

There’s a general idea that in a well-constructed program, each object “knows” how it ought to behave. That’s what its methods are for. Quite obviously, objects cannot be responsible for everything involving them in a program. If each object completely encapsulated all of the things it could do or be involved in, you would never pass one object as a parameter in a message to another object.

For every complex problem, there is a solution that is simple, neat, and wrong.
—H. L. Mencken


For example, you would never have collections. If every object “knew” how to organize itself into collections, you wouldn’t need an Array or Hash, would you? In practice, each object in a system can be involved in many different actions. It has to be responsible for some of them, and it has to play a secondary, passive role in others. Most OO programs do not have every object implement its own collections methods. They may include some form of specialization so you can have an array of accounts, but an array of accounts is still not an account.

subject.verb(object)

In the English language, we have the idea of a Subject and an Object in a sentence. For example, when we say “Jack loves Jill,” Jack is a subject and Jill is an object. Jack loves. Jill is loved. It’s the same in OO programs. Sometimes objects are actively doing things through their methods. Sometimes other objects’ methods are doing things with them.

Verbing Weirds Language
—Bill Watterson, Calvin and Hobbes


Good OO design is, in part, doing a good job of choosing the right bifurcations: given a list of nouns and verbs, making the right decisions about which nouns ought to be the active nouns, the subjects, the ones that “own” the verb in the form of a method. And thus consciously making decisions about which objects ought to be the passive nouns, the objects of the verbs, the ones that don’t implement the methods.

Unfortunately, there are lots of places where we can err on the side of giving too much responsibility to individual objects. It’s understandable, given that OO is theoretically all about objects being responsible for themselves. But as in many other things, in practice good OO is about objects being responsible for as little as possible (but no less!), not as much as possible.

the kingdom of nouns

One common symptom of this problem is a system that has objects for all of the obvious nouns or entities, but not for the verbs. OO began with languages like Simula, where the paradigm was trying to represent real-world entities such as automobiles on a highway. From that time forward, the emphasis has been on having objects for each noun in the problem domain. In such traditionally-organized OO programs, the “verbs” or actions are all attached to objects as methods.






Not all “verbs” have a clear separation between a single entity that is the subject or active entity that ought to own the verb’s definition and the secondary, passive subject entities that should not own the verb’s definition. The easiest examples of this are operations that are intended to be commutative.

For example, many languages define addition as a method belonging to numbers or magnitudes. In Smalltalk, the expression 1 + 2 actually means “send the message + 2 to the object 1.” At first glance, this seems elegant: the number 1 handles the message + 2 as integer addition, while 1.0 would handle the same message with floating point arithmetic. What more could you want?

Well, there is a huge problem with this arrangement: Addition is commutative. 1.0 + 2 must give the same result as 2 + 1.0. Using a simple message to implement addition means that you must be excruciatingly careful to handle all of the possible cases so that you do not accidentally violate this property. Now of course, the designers of system classes like Integer and Float went to this trouble. But if you want to add another magnitude class—say CurrencytwoPlaceDecimal—you have to open up all of the system classes and modify them so that 1 + ThirtyCents gives the same result as ThirtyCents + 1.
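Here is a small sketch of that asymmetry in Ruby. The Cents class is hypothetical and invented for illustration, and treating a bare integer as whole dollars is an arbitrary assumption.

```ruby
# Hypothetical magnitude class, invented for illustration.
class Cents
  attr_reader :amount

  def initialize(amount)
    @amount = amount
  end

  # Cents is the receiver here, so Cents decides what "+ 1" means
  # (one dollar, in this sketch).
  def +(other)
    other_amount = other.is_a?(Cents) ? other.amount : other * 100
    Cents.new(amount + other_amount)
  end
end

thirty_cents = Cents.new(30)
total = thirty_cents + 1          # handled by Cents#+, total.amount is 130

begin
  1 + thirty_cents                # handled by Integer#+, which knows nothing about Cents
  commutes = true
rescue TypeError
  commutes = false                # Integer#+ gives up with a TypeError
end
# commutes == false
```

The new class can only make one side of the addition work by itself; the other side is owned by Integer.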

beware of breaking symmetry

Of course, you may not need to implement a new magnitude class. Fine. But what about symmetric relations like comparison? This is a major pitfall for OO developers: in many cases you need to write a test of equivalence or equality (operations like ==, equal?, eql?, eqv? and all of the other variations on the same theme). In every one of these cases, horrible things will happen if your operation is not symmetric. For every case, x.eql?(y) if-and-only-if y.eql?(x).

This is obviously easy when x and y are both the same kind of object. What happens when they’re different, but still logically equivalent? It turns out that implementing commutative operations and symmetric relations as methods doesn’t work very well. It forces you to smear duplicate logic over many different classes (or prototypes, if your language swings that way).

Here’s a practical example. Let’s say you want to implement a form of equivalence for collections. For ordered collections like lists, what you want is that if two ordered collections have the same members, in the same order, they are equivalent. It’s easy to imagine writing such a method as a mixin for all of your ordered collections. It obviously knows about iterating over ordered collections (recursively, if you grew up with Godel, Escher, Bach on your night stand). Note that you may not have an indexed collection: you might have a list where you simply retrieve values in order.

And likewise, you can write a collection equivalence method for dictionaries like hash tables: if two objects have the same values at the same keys, they are equivalent. Again, a simple mixin will handle things for dictionaries.

Now comes the wrinkle: you decide that an ordered collection ought to be equivalent to a dictionary where the keys are the integers ascending from zero. In other words, ('foo' 'bar' 'blitz') ought to be equivalent to { 0 => 'foo', 1 => 'bar', 2 => 'blitz' }. How are you going to code this? Well, the dictionary mixin could obviously handle equivalence to an ordered list. But we need symmetry, so we have to “open up” the ordered collection mixin and add code for equivalence to dictionaries.

Actually I made up the term object-oriented and I can tell you I did not have C++ in mind. The important thing here is that I have many of the same feelings about Smalltalk.
—Alan Kay


I’m holding my nose, we have not one but two different code smells: 1) Why is one piece of logic in two different places? 2) Why do ordered collections know anything at all about dictionaries, and why do dictionaries know anything at all about ordered collections? The latter is especially disturbing: the whole point of OO is information hiding. How does having ordered collections and dictionaries knowing about each other help us to hide information?

The obvious answer to me is that the knowledge of how to compare an ordered collection to a dictionary does not belong in ordered collections or in dictionaries. The requirement that relations like equivalence be symmetrical across heterogeneous types implies that the types themselves cannot be responsible for implementing equivalence for themselves.
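As a rough sketch of where that knowledge could live instead, here is a free-standing function that owns the whole relation. The names equivalent? and list_equivalent_to_dict? are hypothetical, and the sketch uses plain Ruby Arrays and Hashes as the ordered collection and the dictionary.

```ruby
# The mixed case is written exactly once, in neither Array nor Hash.
def list_equivalent_to_dict?(list, dict)
  dict.size == list.size &&
    list.each_with_index.all? { |item, index| dict.key?(index) && dict[index] == item }
end

# A stand-alone "verb": it looks at both arguments, so symmetry is
# arranged in one place instead of being smeared over two classes.
def equivalent?(a, b)
  if a.is_a?(Array) && b.is_a?(Hash)
    list_equivalent_to_dict?(a, b)
  elsif a.is_a?(Hash) && b.is_a?(Array)
    list_equivalent_to_dict?(b, a)      # same logic, arguments swapped
  else
    a == b                              # default: fall back on ==
  end
end

list = ['foo', 'bar', 'blitz']
dict = { 0 => 'foo', 1 => 'bar', 2 => 'blitz' }

forwards = equivalent?(list, dict)
backwards = equivalent?(dict, list)
# forwards and backwards are both true
```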

Similar problems of code duplication and information leakage apply to modelling relations (why do we declare has_one and belongs_to in Rails?) and to implementing the <=> operator in Ruby. It looks like having verbs “belong to” the subject noun is often a good idea, but not always a good idea.

commuting the sentence of execution

Maybe some verbs belong to objects, but some are best on their own? Maybe + and <=> and equivalent? really ought to be emancipated from their subservience to objects and ought to have their own definitions.

There are two real approaches to object-orientation. The first is known as message-passing. You send an object a message and ask it to deal with it. (This would not work with many people in this newsgroup.) The meaning of the message is local to the object, which inherits it from the class of which it is an instance, which may inherit it from superclasses of that class…

The second approach is generic functions. A generic function has one definition of its semantics, its argument list, and is only specialized on particular types of arguments.
Erik Naggum discussing CLOS on comp.lang.lisp


What we ought to do is take some of the verbs and give them their own place in our programs, instead of hanging them off nouns. This isn’t such a revolutionary idea: Common Lisp’s Metaobject Protocol does this exact thing, providing generic functions. A generic function is, in effect, a verb raised to the same level of abstraction as a noun.

This isn’t some revolutionary idea limited to “powerful” languages either: the Java collections framework uses a Comparable interface for ordering collections. The compareTo(...) method belongs to an object. By way of—ahem—comparison, the Comparator interface extracts comparison out of the subject object and puts it in a separate function object. You can perform sorts in Java either way.

If we aren’t using Common Lisp, can we build the verbs we want out of the tools at our disposal? In other words, can we Greenspun generic functions in languages like Java and Ruby?

generic functions in java, plus a detailed look at method dispatching

Let’s start by thinking about generic functions in a Java-like language.

Returning to our example of writing equivalent?, we might make an Equivalent class with a single method, perhaps we can call it eval. So we end up with something like Equivalent.eval(foo, bar). Java-like languages allow us to write different versions of the eval method with different type signatures, so we can write:

public static boolean eval (List foo, List bar) { ... }
public static boolean eval (List foo, Map bar) { ... }
public static boolean eval (Map foo, List bar) { return eval(bar, foo); }
public static boolean eval (Map foo, Map bar) { ... }

And so forth. Are we done?

No, our code is broken. What happens when we decide that the “default” equivalence is the == relationship? We can’t write:

public static boolean eval (Object foo, Object bar) { return foo == bar; }

This is hideously broken in languages like Java. You’re almost all nodding in agreement, but please be patient while I explain it anyway: you probably want to pass this along to someone who really needs to be told why it is broken, so why don’t I go ahead and explain it for them?

What you want is that if two objects are of the more specific types—List and Map—we will call the more specific version of the eval methods. But if we can’t “match” one of the more specific eval methods, we want to use eval (Object foo, Object bar). Too bad, that’s not how Java works. Java uses two completely different ways to figure out which method to call when you overload methods!

Way number one is for figuring out, when you call noun.verb(...), where to find the definition for verb. This lookup is effectively done at run time, so that even if your code looks like this:

public static void printSomething(Object foo) {
    System.out.println(foo.toString());
}

Java will look up the method toString based on foo’s actual type when the method is called, even though you declared it to be an Object. That’s polymorphism at work, and it’s the information hiding working for us. Each object can do its own thing where toString is concerned, and we don’t have to worry about it. This is called single dispatch, because it figures out which method to call based on just one of the nouns, the subject noun a/k/a the receiver of the method invocation.

But that’s not what happens when we write this:

public static void printSomethingElse (Object foo, Object bar) {
    if (Equivalent.eval(foo, bar))
        System.out.println("2 x " + foo);
    else
        System.out.println(foo.toString() + ", " + bar.toString());
}

It will always call eval (Object foo, Object bar). It will not call eval (List foo, List bar) if you pass it two lists. That’s because although each of our methods has the same name, eval, Java treats them as different methods, and it figures out which one to call based on the declared (static) types of the arguments at compile time, not on the actual types of their values at run time.
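If you want to see this for yourself, here is a small sketch. OverloadDemo, which, and viaObjectParameters are hypothetical names made up for the demonstration; the behaviour shown is ordinary Java overload resolution.

```java
import java.util.ArrayList;
import java.util.List;

class OverloadDemo {
    // Two overloads that report which signature the compiler selected.
    static String which(Object foo, Object bar) { return "Object, Object"; }
    static String which(List<?> foo, List<?> bar) { return "List, List"; }

    // The parameters are declared as Object, so the call below is bound to
    // which(Object, Object) at compile time, whatever is passed in at run time.
    static String viaObjectParameters(Object foo, Object bar) {
        return which(foo, bar);
    }

    public static void main(String[] args) {
        List<String> a = new ArrayList<String>();
        List<String> b = new ArrayList<String>();

        // Declared types are List, so the more specific overload is chosen.
        System.out.println(which(a, b));                // List, List

        // Same two lists, but seen through Object-typed parameters.
        System.out.println(viaObjectParameters(a, b));  // Object, Object
    }
}
```

The objects are identical in both calls; only the declared types at the call site differ, and that is all the compiler looks at.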




Free your mind with A Little Java, a Few Patterns: The authors of The Little Schemer and The Little MLer bring deep and important insights to the Java language. Every serious Java programmer should own a copy.



Besides writing a Lisp interpreter in Java, your next best bet for building a generic function the way we want it is to find a way to turn Java’s single dispatch into a multi-dispatch, to dispatch on two nouns, foo and bar.

The good news is this: dispatching at run time on two different types is a well-known problem, and the solution is called double dispatch. The problem with double dispatch is that it moves our equivalence code back into our nouns, and we don’t want that.

The Visitor pattern might be handy: it’s a way to add methods to an object at run time in a language like Java that supposedly doesn’t do that. If we decide that everything to be compared using Equivalent.eval implements an interface called Visitable, we can build a double dispatch system that doesn’t require putting an equivalent? method in the entities being compared:

interface Visitable {
    Object accept(final Visitor visitor);
}

interface Visitor {
    Object visit(final Object obj);
    Object visit(final List list);
    Object visit(final Map map);
}

public class Equivalent {

    static boolean list_list (List foo, List bar) { ... }
    static boolean list_map (List foo, Map bar) { ... }
    static boolean map_map (Map foo, Map bar) { ... }
    static boolean object_object (Object foo, Object bar) { ... }

    public static boolean eval (final Visitable foo, final Visitable bar) {
        // First dispatch: bar.accept(...) returns a Visitor specialized on bar's run-time type.
        // Second dispatch: foo.accept(...) picks the case matching foo's run-time type.
        // accept returns Object, so both results have to be cast.
        return (Boolean) foo.accept(
            (Visitor) bar.accept(
                new Visitor () {
                    public Object visit(final Object bar) {
                        return new Visitor () {
                            public Object visit(final Object foo) {
                                return object_object(foo, bar);
                            }
                            public Object visit(final List foo) {
                                return object_object(foo, bar);
                            }
                            public Object visit(final Map foo) {
                                return object_object(foo, bar);
                            }
                        };
                    }
                    public Object visit(final List bar) {
                        return new Visitor () {
                            public Object visit(final Object foo) {
                                return object_object(foo, bar);
                            }
                            public Object visit(final List foo) {
                                return list_list(foo, bar);
                            }
                            public Object visit(final Map foo) {
                                return list_map(bar, foo);
                            }
                        };
                    }
                    public Object visit(final Map bar) {
                        return new Visitor () {
                            public Object visit(final Object foo) {
                                return object_object(foo, bar);
                            }
                            public Object visit(final List foo) {
                                return list_map(foo, bar);
                            }
                            public Object visit(final Map foo) {
                                return map_map(foo, bar);
                            }
                        };
                    }
                }
            )
        );
    }
}
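The listing above takes for granted that every Visitable implements accept by handing itself back to the visitor. Here is a minimal sketch of that half of the pattern. VisitableList, VisitableMap, VisitableThing, and the kindOf helper are hypothetical names that appear nowhere in the original; kindOf performs a single dispatch and merely names the case it lands on.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class VisitableSketch {
    interface Visitor {
        Object visit(Object obj);
        Object visit(List<?> list);
        Object visit(Map<?, ?> map);
    }

    interface Visitable {
        Object accept(Visitor visitor);
    }

    // Inside each class, "this" has a known static type, so the compiler can
    // pick the most specific visit overload for us.
    static class VisitableList extends ArrayList<Object> implements Visitable {
        public Object accept(Visitor visitor) { return visitor.visit(this); } // visit(List)
    }

    static class VisitableMap extends HashMap<Object, Object> implements Visitable {
        public Object accept(Visitor visitor) { return visitor.visit(this); } // visit(Map)
    }

    static class VisitableThing implements Visitable {
        public Object accept(Visitor visitor) { return visitor.visit(this); } // visit(Object)
    }

    // Names the run-time kind of a value that is only declared as Visitable.
    static String kindOf(Visitable v) {
        return (String) v.accept(new Visitor() {
            public Object visit(Object obj) { return "object"; }
            public Object visit(List<?> list) { return "list"; }
            public Object visit(Map<?, ?> map) { return "map"; }
        });
    }

    public static void main(String[] args) {
        Visitable a = new VisitableList();
        Visitable b = new VisitableMap();
        Visitable c = new VisitableThing();
        System.out.println(kindOf(a)); // list
        System.out.println(kindOf(b)); // map
        System.out.println(kindOf(c)); // object
    }
}
```

The call to accept is resolved at run time by ordinary single dispatch, and the call to visit inside it is resolved at compile time by overloading. Chaining two of these is what gives Equivalent.eval its dispatch on both nouns.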

If that looks like a lot of work to you, I agree. You’re basically replicating Java’s run time dispatching on two types, so you need a bit of a matrix. Is it worth the effort? Let’s consider what this wins you:


And best of all, you have a nice place for your verbs, and they are no longer second-class citizens behind the nouns.


Update: A few people have suggested alternate approaches to implementing multiple dispatch in Java. I think there are various trade-offs to be made, and several different implementations ought to be considered before you write production code.

However, the point of the article is to suggest that not all functions should be implemented as methods of subject objects. I think it makes that point regardless of what you think of using a Visitor and a double dispatch.

Here’s an alternate approach from Laurie Cheers:
interface Classifiable
{
int classify();
}

// I know this isn't valid Java, but it makes the example much clearer. The alternative is to tiresomely spell out every combination.
#define PAIR(a,b) (a|(b<<4))

abstract class DoubleDispatchable
{
abstract Object list_list(List a, List b);
abstract Object list_map(List a, Map b);
abstract Object map_map(Map a, Map b);
abstract Object object_object(Object a, Object b);

const int OBJECT = 0;
const int LIST = 1;
const int MAP = 2;

Object dispatch(Classifiable a, Classifiable b)
{
switch(PAIR(a.classify(), b.classify()))
{
case PAIR(LIST, MAP): return list_map(a,b);
case PAIR(MAP, LIST): return list_map(b,a);
case PAIR(LIST, LIST): return list_list(a,b);
case PAIR(MAP, MAP): return map_map(a,b);
default: return object_object(a,b);
}
}
}

class Equivalent extends DoubleDispatchable
{
Object list_list(List a, List b) {...}
Object list_map(List a, Map b) {...}
Object map_map(Map a, Map b) {...}
Object object_object(Object a, Object b) {...}

bool eval(Classifiable a, Classifiable b) { return dispatch(a,b) != false; }
}
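Since that listing leans on a C-style macro, here is one way the same idea might look in plain Java. This is a sketch under my own assumptions, not Laurie's code: PairDispatch and WhichCase are hypothetical names, the PAIR macro becomes four compile-time constants, and instanceof checks stand in for the Classifiable interface.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

abstract class PairDispatch {
    static final int OBJECT = 0;
    static final int LIST = 1;
    static final int MAP = 2;

    // Compile-time constants standing in for the PAIR macro, so they can be case labels.
    static final int LIST_LIST = LIST | (LIST << 4);
    static final int LIST_MAP  = LIST | (MAP << 4);
    static final int MAP_LIST  = MAP | (LIST << 4);
    static final int MAP_MAP   = MAP | (MAP << 4);

    abstract Object listList(List<?> a, List<?> b);
    abstract Object listMap(List<?> a, Map<?, ?> b);
    abstract Object mapMap(Map<?, ?> a, Map<?, ?> b);
    abstract Object objectObject(Object a, Object b);

    // Assumption of this sketch: instanceof replaces the Classifiable interface.
    static int classify(Object o) {
        if (o instanceof List) return LIST;
        if (o instanceof Map) return MAP;
        return OBJECT;
    }

    // Hand-written dispatch on the run-time types of both arguments.
    Object dispatch(Object a, Object b) {
        switch (classify(a) | (classify(b) << 4)) {
            case LIST_MAP:  return listMap((List<?>) a, (Map<?, ?>) b);
            case MAP_LIST:  return listMap((List<?>) b, (Map<?, ?>) a);
            case LIST_LIST: return listList((List<?>) a, (List<?>) b);
            case MAP_MAP:   return mapMap((Map<?, ?>) a, (Map<?, ?>) b);
            default:        return objectObject(a, b);
        }
    }
}

// Hypothetical generic function that only names the case it was dispatched to.
class WhichCase extends PairDispatch {
    Object listList(List<?> a, List<?> b) { return "list_list"; }
    Object listMap(List<?> a, Map<?, ?> b) { return "list_map"; }
    Object mapMap(Map<?, ?> a, Map<?, ?> b) { return "map_map"; }
    Object objectObject(Object a, Object b) { return "object_object"; }

    public static void main(String[] args) {
        WhichCase which = new WhichCase();
        Object list = new ArrayList<String>();
        Object map = new HashMap<String, String>();
        System.out.println(which.dispatch(list, map));  // list_map
        System.out.println(which.dispatch(map, list));  // list_map (arguments swapped)
        System.out.println(which.dispatch(list, list)); // list_list
        System.out.println(which.dispatch("x", list));  // object_object
    }
}
```

Nothing here forces you to cover every combination: anything you forget falls silently into the default case.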

What trade-offs, you ask? The Visitor pattern given gets the compiler to guarantee that you write each of the nine cases, whereas hand-written tests and logic simplify the code.

I specifically chose the Visitor pattern because it seemed more in keeping with the spirit of the Java language and culture, trading verbosity for compiler safety.

I'm extremely comfortable with the other trade-off, emphasizing readability and simplicity. Although, if you go far enough down that road, you might as well look at other languages ;-)

Thursday, October 11, 2007
  Three stories about The Tao
That maverick framework author is self-centred and vain. His framework is all about solving his problems, his way, he refuses to look at what the market wants and build something that could be more popular.

There was once a monk who would carry a mirror wherever he went. A priest noticed this one day and thought to himself “This monk must be so preoccupied with the way he looks that he has to carry that mirror all the time. He should not worry about the way he looks on the outside, it’s what’s inside that counts.” So the priest went up to the monk and asked “Why do you always carry that mirror?” thinking for sure this would prove his guilt.

The monk pulled the mirror from his bag and pointed it at the priest. Then he said “I use it in times of trouble. I look into it and it shows me the source of my problems as well as the solution to my problems.”

Sure, the big corporate framework and its language have problems, but they pay the bills.

Once there was a horse tied up on the side of the street. Whenever someone tried to pass, the horse would kick them. Soon a crowd gathered around the horse until a wise man was seen coming close. The people said “This horse will surely kill anyone who tries to pass. What are we going to do?” The wise man looked at the horse, turned and walked down another street.

Those rabid evangelists turned me off with their attitude, so I determined then and there to never look into their stuff, ever.

A monk and his novice were walking through the forest. They came to a stream. On the bank there was a beautifully dressed woman, crying. The monks asked her what was the matter. “I am on my way to a wedding. I have to cross the stream to get there, but the bridge has been washed away. I was searching for a place to cross where I wouldn’t ruin the dress, but I can’t find one and if I don’t make it across soon, I will be late.”

Without a word, the elder monk scooped her into his arms, waded across the stream, and deposited her on the other side. Ignoring her thanks, he waded back and the two monks resumed their walk. They continued on their journey, but the younger monk was agitated and obviously had something on his mind. The elder monk stopped and asked him what was the matter.

“Elder, I am confused. Our vows prohibit us from fleshly contact with women, yet you embraced that woman in your arms. How can this be?” The elder monk eyed his novice with kindly concern. “Novice,” he asked, “I left her on the bank of the stream. Why do you still carry her?”

I've read or heard these stories (and quite a few more) many times and in many forms. I borrowed the wording for the first and second story from this good page. Also, someone was kind enough to point out that these stories are not necessarily about the Tao, and how Taoism is different from Zen, and so forth. Just so you know :-)

Wednesday, October 03, 2007
  Three blog posts I'd love to read (and one that I wouldn't)
“Blogging about blogging” is tiresome, and I apologize in advance for inflicting this on you: I promise to return to technical subjects immediately. But I really do want to read what you have to say, and what better way than to full-on ask you to write?

So here are three blog posts I’d love to read. Write any or all of these, and you have a guaranteed bookmark in my delicious feed and my vote on sites like programming.reddit.com and dzone.com.

What I learned from Language X that makes me a better programmer when I use Language Y

Everybody has a pet language. And many smart people take a crack at learning a new language. Some take two years to try to port an existing, production application. Some read a book and throw it across the room, unconvinced that the new language offers much in the way of value. Whether you immersed yourself in the new language or merely skimmed it, what did it teach you that you can apply to your everyday work?

I disqualify things like “Programming in Assembler reminded me how much I love Object-Oriented Programming in Common Lisp.” Bzzzt! It has to be something you didn’t know before you tried Assembler. Here’s one of my own: Ruby and Javascript support a rich literal notation for collections that made me a better Java programmer: I learned the double-brace initialization idiom and now use it regularly (I liked it so much, I wrote an entire post about it.)

The most amazing example of this kind of thinking is Eric Raymond’s famous quote: “LISP is worth learning for a different reason—the profound enlightenment experience you will have when you finally get it. That experience will make you a better programmer for the rest of your days, even if you never actually use LISP itself a lot.”

You may not experience a profound enlightenment from your next language. But if your mind is open to the possibilities, I bet you’ll learn a lot that will make you a better programmer for the rest of your days. Tell us about it.

Something surprising that you probably wouldn’t guess about Language X from reading blog posts

I get that Java is verbose, that static typing finds bugs and makes it easy to add certain features to IDEs, that there are a lot of jobs writing Java programs, and that there are a lot of frameworks and libraries written in Java. Great. But one more post about those subjects had better be really, really insightful if it is going to get me excited.




The Little MLer introduces ML (and its object-oriented variant Ocaml) through a series of entertaining and straightforward exercises working with lists, structures, and even deriving arithmetic from types.

Learning ML through the book’s ten brief chapters will stretch your understanding of how to leverage types and type checking to write programs that are more than just type-safe but are also semantically correct.

The surprising thing I learned from ML is the power of expressing a program’s semantics with types. It has made me a better programmer whether I’m using a weakly but statically typed language like Java or a dynamically typed language like Ruby.

What can you tell me that might not be obvious from reading the same-old, same-old blog posts out there? For example: Java’s Annotations provide a unique meta-programming mechanism that allows you to write programs that are familiar to the everyday programmer but add code-generation magic.

If I were coming to Java from the Ruby world, this might get me excited enough to really learn how to make my programs sing. I was excited to discover that the IntelliJ people have developed @Nullable and @NotNull annotations that add null-checking to Java at compile time.

The canonical example of this type of post is probably Douglas Crockford’s essay describing how Javascript is The World’s Most Misunderstood Programming Language.

My personal transformation about Idea X

Bob Sutton wrote an amazing article about one kind of smartness, Strong Opinions, Weakly Held. Now it’s great if learning a language or trying a new methodology taught you something new.

But it’s really, really fascinating if you happened to change your mind about something you used to think was critical. If you have had a 180 degree change of heart and embraced what you formerly shunned, or now shun what you formerly embraced.

And I don’t just mean write and tell me the logical reasons that Ocaml taught you how strong typing can be much more powerful than what lame-ass Language X provides. That’s true. But please, if that’s all you want to write, write “Five new things I learned from Ocaml that made me a better programmer when I use Lame-Ass language X” or “Something surprising that you probably wouldn’t guess about Ocaml from reading blog posts.”

This blog post is about YOU. Tell me about your personal journey. Give me the human story. When was it exactly that you had your Aha! moment? How did it feel to let go of years of prejudices and preconceptions? What did it feel like to take the red pill?

Now that Ruby on Rails is having its fifteen minutes of fame, people are lining up on either side of the love/hate divide and everybody seems to take its pluses and minuses for granted. But do you remember how dramatic it was when people like Bruce Tate had their “Born Again” moments? Maybe today their enthusiasm seems subjective and “unprofessional.” But there’s a human story there: when people turn their back on two, five, or even ten years of belief in something, there is a powerful story to be told.

And honestly, I want to hear your story, if you would care to tell it.

I would read any and all of the above three posts with enthusiasm and a deep respect for you stepping outside of the usual same-old, same-old blog posts about languages and tools. But if you choose to write the next kind of post, it’s going to be hard for me to get excited:

Here’s why such-and-such fhpxf tbng qvpx

I am not criticizing anyone who has strong opinions about why certain things are lame.




The Seasoned Schemer is devoted to first class functions (“closures,” as they are known to many). This book is approachable and a delight to read, but the ideas are provocative and when you close the back cover you will have learned how to compose functions out of other functions, whether you code in Java, Ruby, Javascript or just about anything else.

Just because it’s hard to prove something is lame doesn’t mean we should wander around saying that everything has its place and they’re all equally valuable and they all deserve the same real-estate in our minds.

We have to wield the axe and say NO to things, to decide that life is too short for programming in Language Z, or for struggling with Tool Omega, or for hiring people who don’t have a certain kind of degree, or whatever else you know in your heart to be lame.

But speaking as your reader, posts telling me why you’re saying “no” aren’t that helpful. They tell me a lot… about YOU and your preconceptions, not about the lame things. When I re-read my own posts about failure or about metaphors for software development, I now think they say more about me and my journey than they do about shipping software.

The most useful purpose posts about “how lame things are” serve is to help people rationalize decisions they have already made.

If I have already decided that Ruby is flawed, your post giving your seven reasons is useful ammunition when somebody asks me to justify writing Project Foo using server-side Javascript. But am I really sitting on the fence, unsure of what to do until I read your detailed critique? No.

The post that is going to push me away from the lame thing isn’t the post about how lame it is. It’s the post about the useful idea and how good it is. And that’s why I am asking—or even begging—you to write a post describing “Five new things I learned from Methodology P that makes me a better team leader when I use Methodology D,” or “Something surprising that you probably wouldn’t guess about the Factor language from reading blog posts,” or especially “My personal transformation about estimating software schedules.”

Thanks in advance.

Wednesday, September 12, 2007
  We have lost control of the apparatus
I am writing to you as a fellow programmer and software developer. I write in friendship and brotherhood. My heart is heavy, and the news I impart is not good: We have lost control of the apparatus.

I know, the IT department soldiers on grimly. They lost a great battle to the PCs more than twenty years ago, and it took years of struggle with NT Domain Controllers and web proxies before control could be wrestled back from the users. In some places I hear battles are still being waged over USB ports and Bluetooth. IT was lucky to find a new ally against the users in MSFT just when their ancient supporter IBM’s star was fading.

We’ve taken care of everything
The words you hear the songs you sing
The pictures that give pleasure to your eyes
It’s one for all and all for one
We work together common sons
Never need to wonder how or why

But we programmers have lost and we must be realistic about things. The fact of the matter is this: people own their own computers, and our applications are no longer the primary way they learn how computers ought to work.

I know, I know, they stare at our work for eight, ten, or twelve hours a day. So you would think that we would set the standard for how computers ought to be. But the Good Old Days when most of our users had never seen a computer before work are gone. Some of our users, fresh out of school, have already been using computers for ten years!

As if that wasn’t enough, the really bad news is, when our users go home they have this thing called the Internet. I know, IT locked that down in the office. But we can’t stop them from getting on it at home, on their mobiles, and now even on those insidious Apple iPods! And when people use the Internet, they are actually using other people’s applications.

I’m not kidding. Our users are being exposed to applications we don’t control. And it messes things up. You see, the users get exposed to other ways of doing things, ways that are more convenient for users, ways that make them more productive, and they incorrectly think we ought to do things that way for them.

computers

This business of users buying their own computers is really troublesome. For one thing, they can buy a computer in the store for a few hundred dollars that is twice as good as the warmed-up left-over IT put on their desk. Until recently, we could shrug our shoulders. We weren’t the ones that had to explain that to get an up-to-date model, they needed to fill out a twenty-seven-b-stroke-six. Which needed approval. Against budget. And then IT would order it from DELL. Who don’t have the CPUs in stock.




And meanwhile, the very same users could walk across the street and buy themselves a much better PC for less money than we pay and take it home the same day. And until now, we didn’t worry about it. We’re the programmers.

But now it’s a problem. Here’s the thing: those PCs they buy at home? The ones that are two, three, or even five times better than the ones on their desk? With huge drives, and lots of memory? The thing is, they run much richer applications than the warmed-over TTY stuff we’ve been feeding them under the guise of “being hip to the Internet revolution.”

For example, my Mother uses Skype to talk to her friends. She thinks it’s normal to see all of your voice mail messages in a list on the screen. If I tried to give her a CRM application for managing contacts, the very first question she would ask would be, “Why can’t I listen to all of the voice mails from that contact in the application?”

Do you think she would have patience for my explanation that the company’s phone systems are complex and proprietary and that we can’t install Asterisk just for her? She would grab me by the ear and drag me to my desk to get cracking on it!

You laugh, but users are used to a lot more functionality than a web page with a form on it these days, and if we don’t give it to them, the noise from all the whining is going to drive us insane.

Here’s another thing. How many monitors do your users have? One, right? There’s no point in giving them more, because we build our applications with session management that basically goes insane if they try to open two windows at once, right? Well guess what? Users have multiple monitors at home for games, and they want them at work.

And when they have them at work and you explain that no, they can’t be in the middle of doing something for XYZCorp in one window, and open a new window when a call comes in from ABCLimited, because that screws up the session, they are going to bitch and moan, because they can do two things at once in their mail application and their social network and their game. Sucks to be us.

And don’t get me started about hard drives. Would you believe that users now expect (as if anyone gave them permission to have expectations)... what was I saying? Oh yes, users now expect that we should store everything and anything. That’s right, they think we should be able to handle arbitrarily large text fields, with styling, and pictures, and even sound or video, all stashed away where they can find it again. They want to be able to store everything to do with an account or a customer or a project or whatever right in our applications.

I tried telling them that databases were not designed for multi-megabyte Word files and PowerPoint presentations, and they just goggle at me like I’m a circus freak and ask me why I’m using a database if it doesn’t do what they want for them? I try to tell them that we don’t have a budget for buying bigger drives, and the poor, deluded fools pull out their credit cards and offer to buy a terabyte drive over the Internet for less money than our vendors charge us per phone call.

I tell you, users refuse to sit and be trained like puppies. And the bottom line is this: we can’t keep putting HTML lipstick on a 1960s pig forever. We have to get out of the office and look at what a modern PC looks like—drive, speed, RAM, monitors, everything—and write applications that can take advantage of it.

fields

You know how all of our applications have a first-name and a last-name? This has worked for decades, you would think people would understand the value of standards, of consistency. But out there in the Internet, where there is no Adult Supervision, some of those applications have just one field for a name. You put spaces in it to separate first and last, just like writing a letter, I guess.

Please, stop laughing. I had the same reaction. How could that possibly work? But the rogues who build that kind of heresy put in a lot of clever little rules and things so that the single field can handle cases like Braithwaite, Reginald or George Smithers, Esq. or even Doctor Wu properly.

When I told one of our users, a business analyst, that using just one field for the name meant a huge amount of work for programmers, she actually asked me what our job was, if not to do the work that makes users productive. I’m very much afraid that things are out of hand. I tried to explain how our database schema works, and so on, but she impatiently insisted that it was our job to make things work.

She then started lecturing me—lecturing ME!—about copy and paste, of all things. She said that it should be convenient to copy and paste between our application and other applications like Word, Excel, and even Outlook. With separate fields, you need to do multiple steps to copy a name between our applications and a letter.

I felt betrayed by Microsoft. Whatever happened to the days when people used just one application, and if they needed another we would give them an export routine? Now she wants to be able to copy a name out of an email and paste it into our application without carefully selecting the last name, first name, and honorific separately.

When I left her office, she was mumbling something about SaaS or some such. I don’t remember what she meant, but “sass” just about sums up her attitude.

If she was just one lone uppity user, I could handle it. But they’re popping up like toadstools. Just the other day another user was asking why he couldn’t paste an address as one field, including zip code. I told him I didn’t have time to explain how database columns worked, but again he muttered something disrespectful and later I saw him hefting his swingline rather menacingly while looking at our application on his screen.

Brothers and sisters, I know this is hard to take, but we’re losing this war. Our DBA cousins have brainwashed Corporate into believing that they are the custodians of the data, and of the sacred Stored Procedure that Controlleth Access. We have very little choice about how things work.

But still, the users expect us to make applications every bit as useful for them as the applications they use on the Internet, and I fear that they will rise up and revolt very soon if we don’t find a way to make the database invisible and make user applications conform to these horribly user-centric heresies.

I know, I see the pitchforks in your hands, and your desire to maul, hang, and burn the messenger is understandable. But that won’t fix things, so please put them down, ok?

stories

And don’t give me that “User Stories” flim-flam, please. I practically invented bamboozling users into doing what we want by pretending to put them in charge of an Agile Process. Agile Process indeed, anybody ought to know that if it’s a Process, it sure as heck can’t be Agile.

So you think you can do what we’ve always done, hunh? When they complain about copy and paste, write it up as a story, put it in the backlog, and then—look at that—there’s always a higher priority story to do. There’s always some new functionality that offers a greater ROI than polishing an old feature.

Well, sister, where that goes wrong is that this isn’t polish: it’s what our users expect as basic functionality. You might think it is new and improved and doesn’t add value, but what our users think is that our applications are old and broken and waste their time.

So while on paper a new feature is more important than making paste work, in practice it looks like we build software that slows the organization down.

So save your breath and stop using Agile as an excuse for slapping the crudest crap together and putting fast in front of finished.

search

You would think things couldn’t get any worse. But they are worse, much worse. I’ll just say one word. Google. Those bastards are practically the home page of the Internet. Which means, to a close approximation, they are the most popular application in the world.




The authors of Prototype and Scriptaculous in Action are the people who brought you the incredible Prototype and script.aculo.us Javascript libraries. This book explains how to use them to build reusable, literate Javascript, how to build dynamic, Web 2.0 applications, and best of all, how to write web applications without tearing your hair out in frustration with Javascript.

With this book and these libraries, you’ll learn how to write better Javascript in a Lisp-like functional style, and as a bonus you’ll also learn how to write better Javascript in a conventional OO style.

And what have they taught our users? Full-text search wins. Please, don’t lecture me, we had this discussion way back when we talked about fields. Users know how to use Google. If you give them a search page with a field for searching the account number and a field for searching the SSN and a field for searching the zip code and a field for searching the phone number, they want to know why they can’t just type 4165558734 and find Reg by phone number? (And right after we make that work for them, those greedy and ungrateful sods’ll want to type (416) 555-8734 and have it work too. Bastards.)

I have tried explaining that there’s an ambiguity if an account number is also 4165558734. But those damn users just give me that “Boy, you are stupid” look that made Samuel Jackson famous. They think we should just show them what we find and let them sort it out. They’re idiots, obviously, but they’re our idiots and I’m pretty sure that if we fire them all we’ll have to clean our own desks out the following day.

They don’t even get that search results should always show stuff from the same table. Would you believe, if they type a phone number, they want us to search companies and persons. They have no respect for our careful husbanding of hardware resources. The profligate spendthrifts think that just because they have a two gigahertz PC at home that can search their entire hard drive for a phone number as fast as they can type it—thanks again, Google—we should make search as fast and as easy to use in our applications.

We can use tools like Nutch and what-not for full-text search. But users want to search everything, everywhere, just like Google. And try as we might to get them to use Sharepoint, they just deride it as a heap of junk. We are going to wind up ceding control of our data to Google sooner or later. I hate to be the one to tell you this, but you might as well hear it from a friend:

You need to start coding your applications so that an external search engine can search them. That’s right, you need to work with a desktop search tool and a network search tool so that people can type 4165558734 and see everything, mail, word docs, and records in your database, in one place.

It’s the future. A miserable, groveling future where our applications work for the users instead of the users working for our applications, but it’s our future.

Suck it up and roll with it.



Sunday, August 26, 2007
  Ruminations about the performance of anonymous functions in naive Javascript implementations
Block-Structured Javascript (better known as the Module Idiom) looks like this:

(function () {
  var something_or_other;
  // code elided
  return something_or_other;
})()

This creates a new, anonymous function with its own local scope. Whenever this code is executed, the interpreter creates a function record in its memory. The exact same thing happens if you create a function and bind it to a variable with var foo = function (...) { ... };.





The Seasoned Schemer is devoted to the myriad uses of first class functions. Luckily for us, the ideas in this provocative book map directly to Javascript (see the plug for Lisp in Small Pieces below).

When you close the back cover you will be able to compose programs from functions in powerful new ways, and you can use these new techniques in Scheme, Ruby, and Javascript immediately.

Now let’s consider another common pattern, the Inner Function: we have a function, and the function needs a helper function. We define the helper function inside our function to make our code more encapsulated:

var factorial = function (n) {
  var factorial_acc = function (acc, m) {
    if (0 == m) {
      return acc;
    } else {
      return factorial_acc(m * acc, m - 1);
    }
  };
  return factorial_acc(1, n);
};

What happens when we invoke factorial six times?

When the interpreter first encounters the code defining factorial, it creates a function and assigns it to the variable factorial. Then each time we invoke the factorial function, the interpreter creates a new function record for factorial_acc. So in total, the interpreter creates seven functions in memory, not two.
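One way to observe this for yourself is sketched below. It assumes we are free to change factorial so that it hands back its helper along with the answer; the name factorial_with_helper and the shape of the returned object are hypothetical, added only for illustration. If each invocation really creates a new function record, the helpers from two invocations should be two different objects.

```javascript
// Hypothetical variant of factorial that also exposes its helper,
// so that we can compare the helpers created by separate invocations.
var factorial_with_helper = function (n) {
  var factorial_acc = function (acc, m) {
    if (0 == m) {
      return acc;
    } else {
      return factorial_acc(m * acc, m - 1);
    }
  };
  return { value: factorial_acc(1, n), helper: factorial_acc };
};

var first = factorial_with_helper(6);
var second = factorial_with_helper(6);

// Both invocations compute the same answer...
console.log(first.value === second.value);   // true
// ...but each invocation built its own factorial_acc.
console.log(first.helper === second.helper); // false
```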

hand-rolling

If this code needed hand optimization, you might want to consider ‘lifting’ the definition of factorial_acc outside of factorial, so it doesn’t get recreated with every invocation:

var factorial_acc = function (acc, m) {
  if (0 == m) {
    return acc;
  } else {
    return factorial_acc(m * acc, m - 1);
  }
};
var factorial = function (n) {
  return factorial_acc(1, n);
};

This produces exactly the same result as our Inner Function version. factorial_acc doesn’t use any of factorial’s parameters or variables, so it does not really need to be inside its scope to produce the correct result.

Now you only need two function records, not seven. Two is cheaper than seven. The problem with this approach is that you are proliferating names. If you are binding functions to names in the global environment, it quickly becomes crowded. And you also have a readability issue. Does anything else need to use factorial_acc? The original code made it very obvious that factorial_acc is only ever used by factorial.

A block can help. Yes, the cause of our performance consideration—dynamically creating functions—can actually be part of the solution:

var factorial = (function () {
  var factorial_acc = function (acc, m) {
    if (0 == m) {
      return acc;
    } else {
      return factorial_acc(m * acc, m - 1);
    }
  };
  return function (n) {
    return factorial_acc(1, n);
  }
})();

Now what happens? Well, we create an anonymous function for our block. One function record. Within that block’s execution, we create two more functions, one assigned to the variable factorial_acc, and one returned from the block (and then assigned to the variable factorial). This code creates three function records, which is still much better than seven.
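A rough way to check the claim that the helper is only defined once is to count how often the defining code runs. In this sketch, helper_definitions is a hypothetical counter that is not part of the idiom itself; it is there purely so we can watch what happens across six invocations.

```javascript
// helper_definitions is a hypothetical counter, added only to observe
// how many times the code that defines factorial_acc actually runs.
var helper_definitions = 0;

var factorial = (function () {
  helper_definitions = helper_definitions + 1;
  var factorial_acc = function (acc, m) {
    if (0 == m) {
      return acc;
    } else {
      return factorial_acc(m * acc, m - 1);
    }
  };
  return function (n) {
    return factorial_acc(1, n);
  };
})();

// Invoke factorial six times.
var results = [];
for (var i = 0; i < 6; ++i) {
  results.push(factorial(i));
}

console.log(results.join(', '));  // 1, 1, 2, 6, 24, 120
console.log(helper_definitions);  // 1
```

Moving the counter into the inner function version (just above its definition of factorial_acc) should report six definitions for the same six invocations.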

As a correspondent summarized in email, “we’ve shown how to replace a simple function containing an inner function with a block call that returns a closure referencing the inner function so as to avoid re-defining it on each call. That’s all there is to it.”

(By the way, Douglas Crockford has done a very good job of explaining this idiom in Javascript, and named it the Module Pattern. Here’s a discussion with particular emphasis on OO-style programming. And here’s a really detailed examination from the YUI team.)

So should you always rewrite inner functions to use a block like this?

I don’t personally fool around with this kind of hand optimization willy-nilly. (Of course, you may find the block version more readable than the inner function version. If you do, it’s a win to write it that way.) It has a cost: in a more complex function, defining helpers outside of the function may move them further away from where they are used, which is a loss for readability. If you prefer the inner function version, you should be very sure you have a performance problem before you leap to the conclusion that you should rewrite it.

a heuristic for automatic optimization of inner functions and blocks

Lisp implementations have been optimizing this kind of code, automatically, for decades. That’s because Lisp programmers have been writing programs in this style for decades, either directly or using macros like let. Here’s the basic heuristic:





Lisp in Small Pieces is one of the most important books about Javascript ever written. WTF!? may be your first thought. Hold on. Javascript is, at its heart, a very Lisp-like language with C syntax. So understanding Lisp helps you understand Javascript.

What makes Lisp in Small Pieces special for Javascript programmers is that it illustrates the principles underlying Lisp (and therefore Javascript) by creating a series of implementations, each of which illustrates the basic mechanisms in the language.

These deep ideas are exactly the things that make Javascript different from other C-syntax languages like Java or Visual Basic. This book, more than any other, will take your understanding from knowing what works on the surface to understanding why and how it works.


There’s also a well-known optimization for making blocks themselves free or nearly free: lambda lifting. So before optimizing things prematurely, test your implementation and see if it is already fast enough for your purposes.

You may discover that you don’t save anything by rewriting things yourself. (You may make your code slower: some optimizations rely on knowing the exact scope of the code being optimized. If you proliferate names by lifting things yourself, the optimizer may not be able to use all of its tricks.)

These techniques have been known for twenty-five years. If a Javascript implementation that you are forced to target doesn’t include them, why not demand that the implementers get with the program and, you know, use some of the stuff we’ve known about programming for almost as long as they’ve been alive? Especially if they brag about their prowess at creating programming languages?

conclusion: nice-to-know, but not essential

My personal conclusion is that the behaviour of a naïve implementation is a “nice-to-know.” I don’t personally worry about optimizing it until I have a known performance issue, at which point it is essential to test to see whether some of the hand-optimizations will actually help.

YMMV.


1. Full closures make things tricky:

Functions that refer to variables in their immediate parent scope are much trickier to optimize away. Sometimes, such a function is supposed to be created anew for each invocation of its parent. For example, if you want to construct a bank balance thingy without using objects, you might write:

function (balance) {
  return function (amount) {
    balance = balance + amount;
    return balance;
  };
};

You pass in an initial balance, and it gives you a single function that you can use to deposit (pass a positive number), withdraw (pass a negative number), or check the balance (pass zero). It returns the updated balance in each case.

The inner anonymous function cannot be lifted or optimized away because of its reference to balance in its parent and because the function can be shown to “escape” its parent.
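To see why a fresh function is needed for each invocation of the parent, here is a small usage sketch. The name make_account is hypothetical (the function above is anonymous), as are the sample amounts. Each account gets its own inner function closed over its own balance, so the two must not be merged into one.

```javascript
// make_account is a hypothetical name for the anonymous function above.
var make_account = function (balance) {
  return function (amount) {
    balance = balance + amount;
    return balance;
  };
};

var mine = make_account(100);
var yours = make_account(10);

console.log(mine(50));   // 150: deposit
console.log(mine(-30));  // 120: withdrawal
console.log(mine(0));    // 120: check the balance
console.log(yours(0));   // 10: a separate balance, untouched by mine
```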

Whereas in this contrived example:

function (balance, owner) {
  return function (amount) {
    return {
      new_balance: (function () {
        if (balance + amount >= 0)
          return balance + amount;
        else return balance;
      })(),
      account_owner: owner
    }
  };
};

Although the inner function still cannot be optimized away, the block within it can be lifted into the inner function and removed, producing:

function (balance, owner) {
  return function (amount) {
    var __temp;
    if (balance + amount >= 0)
      __temp = balance + amount;
    else __temp = balance;
    return {
      new_balance: __temp,
      account_owner: owner
    }
  };
};


Monday, August 20, 2007
  Block-Structured Javascript
Javascript provides closures with first-class access to variables declared in the enclosing environment. Besides being a handy piece of trivia if you are ever playing Programming Jeopardy, what use is this to the actual working programmer?

There are a lot of ways to take advantage of Javascript’s closures. I am going to describe just one: replicating Algol’s block structure (or Lisp’s let and begin macros, if you prefer). When we’re all done you’ll have a handy tool for making your code more readable, separating its concerns, and generally making life easier for programmers who have to read what you’ve written.

From Bricks to Blocks

A block is a chunk of code inside a function. Blocks have well-defined entry and exit points. Blocks have their own local variables and functions, and they also have first-class access to variables and functions defined around them. Blocks may nest.

Structuring code into blocks makes large functions more readable and easier to refactor. All of the variables and logic needed for one thing are encapsulated together in the blocks where they are needed, not scattered about in functions everywhere.

when a better design—if unfamiliar—is shown to developers or experienced users, they tend to reject it. Often, it takes careful explanation and having them gain experience with it before the improvement is understood.
—Jef Raskin


While the idiom may be slightly unfamiliar to some, I think you’ll agree that it is highly accessible and not some sort of über-construct of interest only to the self-professed hacker. And when a programmer encounters it in your code, she will have no trouble figuring out what it does and how it does it.

block structured code

I think everyone can agree that structured programming is a good thing. Functions should be composed of blocks, with the blocks linked together by constructs such as if statements:

if (some_condition) {
  // a block
}
else {
  // another block
}

Back before structured programming was introduced, there were gotos everywhere. Looking at a piece of code with labels, it was hard to know what the flow of control might be at run time. In short, you never had the confidence that everything you needed to know about that code was right there next to that code.

Like almost all modern languages, Javascript’s blocks do structure the flow of control so that you have nice clean entries and exits from each block. And also like most other modern languages, Javascript does nothing to structure access to variables inside blocks. For example:

var j = 0;
if (some_condition) {
  // a block with some code elided
}
alert(j);

What will be shown? You can’t guarantee that zero will pop up, because the blocks might modify j, like this:

var j = 0;
if (true) {
  var j = 42;
}
alert(j);

This looks like a programmer error. After all, the var keyword should be used to declare a new variable. But the code between the braces isn’t really a new block of code with its own variables. This is hardly a problem with trivial examples. But if you start building some larger functions, the possibility of accidentally overwriting a variable looms larger. Especially once you start refactoring and moving blocks of code around.

What can we do? How about this: the “block” idiom (you can also call it let or begin if you want to sound like a Schemer). Here’s the code:

var j = 0;
if (true)
  (function () {
    var j = 42;
  })();
alert(j);

There’s a lot of syntactic noise. What does it mean? In short, we said: “Create a new function. In the body of the new function define a variable called j and assign 42 to it. Then call that function without any parameters.” Because our new instance of j is inside a function, it is not the same variable as the j outside of the function. That can be handy.
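As a side-by-side sketch of the two snippets above, the versions below wrap each one in a function and return j instead of calling alert, so the difference can be printed from any Javascript environment. The names without_block and with_block are hypothetical, chosen only for this comparison.

```javascript
// Braces alone: the second var j is the same variable as the first,
// because var declarations are scoped to the enclosing function.
var without_block = function () {
  var j = 0;
  if (true) {
    var j = 42;
  }
  return j;
};

// The block idiom: the inner j lives in its own function scope.
var with_block = function () {
  var j = 0;
  if (true)
    (function () {
      var j = 42;
    })();
  return j;
};

console.log(without_block()); // 42
console.log(with_block());    // 0
```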

Are there any other benefits of this idiom? Yes indeed. Sometimes you have an assignment and you need some logic on the right hand side. How do you write:

var proven = {
  var n = Math.round(100*Math.random());
  var total = 0;
  for (var i = 0; i <= n; ++i) {
    total = total + (2*i) + 1
  }
  return total == ((n + 1) * (n+1));
};
alert(proven);

You can’t, of course. There are two problems with trying to use braces in this case. First, Javascript only allows braces to form code blocks in conjunction with specific keywords like if and function. Second, Javascript code blocks are not expressions—they do not produce values. This is why languages like Javascript need an if statement and a ternary operator: if blocks produced values, you would only need if expressions.

So in traditional Javascript style, you have to define a function somewhere else and call it… You’ll notice our block idiom includes defining and calling a function. What if our function returns a value? In that case, we can use a block anywhere we want a value, for example:

var proven = (function () {
  var n = Math.round(100*Math.random());
  var total = 0;
  for (var i = 0; i <= n; ++i) {
    total = total + (2*i) + 1
  }
  return total == ((n + 1) * (n+1));
})();
alert(proven);

This new idiom allows us to make first-class blocks anywhere we like. Our blocks are expressions, and we can use them anywhere we need a value. And as above, our variables are fully encapsulated: they do not overwrite variables defined elsewhere.
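The block above checks the identity for a single random n, so its name is a little optimistic. As a variation on the same idiom, this sketch checks every n from 0 to 100 and still delivers a single value to the assignment. The name holds_up_to_100 and the upper bound are assumptions made for the example.

```javascript
// holds_up_to_100 is a hypothetical name: true if the sum of the first
// n + 1 odd numbers equals (n + 1) squared for every n from 0 to 100.
var holds_up_to_100 = (function () {
  for (var n = 0; n <= 100; ++n) {
    var total = 0;
    for (var i = 0; i <= n; ++i) {
      total = total + (2*i) + 1;
    }
    if (total != ((n + 1) * (n + 1))) {
      return false;
    }
  }
  return true;
})();

console.log(holds_up_to_100); // true
```

The variables n, total, and i exist only inside the block, so nothing outside it needs to care what they are called.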

blocks vs. named functions

You may be wondering, "Why can’t we use a named function?" This is the style in languages like Python, where the Benevolent Dictator does not permit constructions like this. Here is the above code using a named function:

var proven_helper = function () {
  var n = Math.round(100*Math.random());
  var n_plus_1_squared = function (n) {
    return (n + 1) * (n+1);
  };
  var sum = function (n) {
    var total = 0;
    for (var i = 0; i <= n; ++i) {
      total = total + (2*i) + 1
    }
    return total;
  };
  return sum(n) == n_plus_1_squared(n);
};
var proven = proven_helper();
alert(proven);


I find this almost as good as the block. Since you only use it in one place, it is defined where you use it. That is good. And the name might be helpful documentation, just like a one or two-word comment. Balanced against this is the fact that you have added a new function to the outer scope. Reading it later, you might have to scan the rest of the code to see if it is used elsewhere.

There's also a very small advantage of the block over the named function: since you need two statements (one to name a function, another to use it), you can only use a named function in normal code blocks. You cannot use a named function when you need an expression, unless you resign yourself to naming the function in one place and using it somewhere else.

For example, when constructing array or hash literals, you can use expressions. A block is an expression, while two statements (one to create a named function and one to call it) are not an expression. So a named function would need to be defined outside of an array or hash literal, while a block can be used inside it, placing the code closer to where it is used.
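A minimal sketch of that point, with made-up names (table, label, values, pair) and made-up data: each block sits inside the literal, right where its value is needed, and its working variables stay private.

```javascript
// All names and values here are hypothetical, for illustration only.
var table = {
  label: 'squares',
  // A block as a property value inside a hash literal.
  values: (function () {
    var result = [];
    for (var i = 1; i <= 5; ++i) {
      result.push(i * i);
    }
    return result;
  })(),
  // A block as an element inside an array literal.
  pair: [
    'total',
    (function () {
      var total = 0;
      for (var i = 1; i <= 5; ++i) {
        total = total + (i * i);
      }
      return total;
    })()
  ]
};

console.log(table.values.join(', ')); // 1, 4, 9, 16, 25
console.log(table.pair[1]);           // 55
```

With named functions, both helpers would have to be declared before the literal begins.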

block structure and cleaner code




JavaScript: The Definitive Guide takes the time to actually discuss the language, to explain what Javascript can do and how to do it. And of course, the book also provides an in-depth reference of every function and object you are likely to encounter in most implementations. Recommended.

Structuring code into blocks makes large functions more readable and easier to refactor. All of the variables and logic needed for one thing are encapsulated together in the blocks where they are needed, not scattered about in functions everywhere. If you see a variable declared inside a block, you know it is only used inside the block. If you see a variable with the same name outside the block—a regrettable occurrence—you know that moving or changing the block will not affect the code working with variables outside of the block.

You probably know that you can put a function inside of a function in Javascript:

var factorial = function (n) {
  var factorial_acc = function (acc, n) {
    if (0 == n) {
      return acc;
    } else {
      return factorial_acc(n * acc, n - 1);
    }
  };
  return factorial_acc(1, n);
}
alert(factorial(6));

And this is a good thing: it keeps the function factorial_acc inside of factorial. Since that’s the only place you need it, why declare it anywhere else? The fact that you can put a function inside of a function implies that you can put a function inside of a block as well:

var proven = (function () {
  var n = Math.round(100*Math.random());
  var n_plus_1_squared = function (n) {
    return (n + 1) * (n+1);
  };
  var sum = function (n) {
    var total = 0;
    for (var i = 0; i <= n; ++i) {
      total = total + (2*i) + 1
    }
    return total;
  };
  return sum(n) == n_plus_1_squared(n);
})();
alert(proven);

If you only need the functions n_plus_1_squared and sum to do this one job, in this one place, why should they be defined at top level cluttering up your code? Why force other programmers to search through your code figuring out where they are used before making changes?

Block structure may seem unfamiliar at first, but give blocks a try and see whether you start finding the code even easier to read and refactor with blocks. Like me, you will find that structuring your code with blocks puts the things you use right where you use them.

update: Ruminations about the performance of anonymous functions in naive Javascript implementations.


  Bricks
I’m just finishing off some work with a corporate client before moving back to my natural position in product development (with a really exciting company!).

It’s a good time to reflect back over what was straightforward and what was difficult, what worked and what didn’t. It has been a very positive experience overall, and I have learned a few more things. Here is a hotch-potch of thoughts about corporate projects, clumsily organized around a single metaphor.

software is not made of bricks

Although very few managers ever express it directly this way, many behave as if developing a piece of software is like building something fairly simple out of bricks. It might be something large. But it’s still fairly simple.

This is tempting. The inexperienced person thinks that bricks are easy to understand: they’re all the same: if you know how to work with one, you know how to work with them all. You can measure the progress of building something out of bricks by counting the bricks in place. And anyone can contribute, even if only by moving bricks from one place to another using brute force.


Why this image is a cliché

When you have a brick by brick mentality, deep in your soul you believe that a project contains a fixed amount of work. It’s just a pile of bricks. There are so many screens, so many lines of code. You think this to be true because when you examine finished applications, you can count these things. So you engage in a discovery step up front where you estimate how many screens and how much code will be needed, then you play some games with numbers of people and the amount of work they can do per day, and out comes an estimated ship date.

You believe that since the finished work contains a fixed number of bricks, it is possible to know in advance how many bricks will be needed, and where they belong in the finished software.

This model of software development leads to several natural assumptions about how to organize a project. These assumptions are logical consequences of the belief that software is made of bricks:

assumption: it’s all about moving bricks

The brick by brick mentality thinks of software development as making a pile of bricks. Think of the stereotypical Egyptian Pyramid as an example. There are so many bricks to pile and then you’re done. If it’s all about moving bricks, any work that moves bricks contributes to the success of the project.

That’s a comforting thought. Just keep those bricks moving. This helps us with all sorts of problems. Some people debate whether star programmers really are twenty times more productive than doofuses. Who cares? As long as the doofus can move bricks, eventually the work will get done.

So if you have a poor performer, someone who is slow and not very careful, you can use them on a project. Just find the right place for them where they can’t accidentally wreck the whole pyramid, and they can help. Ok, they are not good with the tricky booby traps or aligning the windows to allow light to strike the altar at the solstice. Fine. But what about ferrying bricks from the dhow to the base of the pyramid? Doesn’t that move the project forward?

Can’t you hire almost any warm body with ten working fingers and put them to work somewhere? Perhaps they can fiddle with page layouts, or copy the work of more experienced developers when implementing new features that are similar to existing features. But an extra pair of hands is always helpful, right?

software is more complicated than bricks

This assumption is wrong. The reason it is wrong is that software is deep. It is not a simple pile of bricks. Examining a finished piece of software, it is easy to discern surface forms like patterns, variable names, or rough organization. But the motivations for these choices are often subtle and opaque to the journeyman.

You can observe this the next time you are interviewing developer candidates. Ask them to name a design pattern: perhaps they respond, “Singleton.” Design patterns are surface forms. Now ask them to explain what problem the pattern solves. They respond, “Ensuring there is exactly one of something.” We are still working with the surface form.

Ask why we want just one of something like a database connection pool. What problem are we solving? Why can’t we use class or static methods to solve this problem? What are the real-world issues with having 1,000 threads sharing a single database connection pool? How would you build ten pools? Or share connections without a single pool?




Critical Chain is an amazing book. The narrative form—a novella detailing a technical project team and their search for a way to manage an uncertain process—is a big win, it highlights the important ways that Critical Chain Project Management handles risks and uncertainty and makes it visible where everyone can manage it.

The section on estimating tasks alone is priceless. If you can’t afford a copy and your library doesn’t stock it, borrow mine. You must read this book if you participate in software development teams.

All of these questions drive at the deeper issues underlying development choices. A developer who treats their work as moving bricks, who simply copies the surface form of code they encounter, is oblivious to the motivations behind the code. They do not know when to use it and when to forgo it. They do not understand alternative ways of solving the same problem. They reproduce it blindly (and often incorrectly).

The result is software that superficially appears to be of acceptable quality because its surface form has things in common with good software. However, just because good software may be constructed out of commands and strategies, this does not mean that software constructed of commands and strategies is good.

What is needed on a software development project are people who understand the nuances, the requirements, the underlying problems. If you think that you are building a pyramid, what you want are architects, not slaves.

When you add people to a project who do not deeply understand their work or the problems the project faces, you create the superficial appearance of progress (look at all the bricks!), but you are slowly building up a mass of unworkable code. You may not see the problems immediately, but in time you will discover that everything they have touched needs to be re-written.

determine the baseline competence required for a project and don’t violate it

Once you understand that software is not a simple pile of bricks, you understand that the minimum level of competence required to contribute positively to a project is non-trivial. You can decide for yourself whether you need the mythical great hackers or not. But there is a minimum level of competence, and if you do not allow persons below that level onto your project, you will succeed.

In fact, you are far better off with a core of competent people and no sub-competent help than you are with the same group of people and the assistance of “helpers” on the side. Those “helpers” require three or four times as much management attention as the core contributors if you are to keep them from breaking things. And as we’ll see below, re-organizing your project such that there are tasks to keep them busy is usually harmful in and of itself.

Protecting yourself from people unlikely to make a positive contribution may require adroit maneuvering on your part. On one project, I explained that we could not complete the work in the time requested by the client. The response was to offer us some part-time assistance by employees of the client. Those particular employees may have been talented, but their experience was not a direct fit for the technical work of the project, and they did not have a full-time commitment to the success of the project.

Rejecting such “assistance” is tricky: other managers may have trouble with the idea that the project will move more slowly with the extra help, rather than move more quickly. Those managers see your project as a pile of bricks, and you’ll need to educate them if you are to avoid disaster.

software development is difficult to parallelize

The metaphor of a pyramid being built, brick by laborious brick, is useful for illustrating another principle. When you assume that an application is a pile of a million bricks, you assume that you can move bricks in parallel. You can have one thousand people on the project, and if each places one brick per hour you will move forward at a constant rate of one thousand bricks per hour.

Software is not like this. Parallelizing development has serious and sometimes fatal consequences. The main problem is that the pieces are usually coupled in some way. There are techniques for lowering coupling between “bricks,” but when you set out to place two related bricks simultaneously, you must, perforce, do some kind of design or thinking ahead of time as to how they relate so that you can place them properly.

Consider two pieces, A and B. The natural dependency between them is that B depends on A. The right thing to do is to build A, and then B when you are happy with A. But the zealous manager with bricks on her mind asks, “why can’t we decide on an interface, I, between A and B, then build both at once?” They want to build I, then A and B simultaneously.

Of course, this constrains A and B tremendously. Any flaw or shortcoming in I that you discover as you build the pieces will result in rewriting both A and B. Only you are under time constraints, so you just patch and kludge, because the schedule does not have time allocated for redoing things: your motivation in parallelizing A and B was to save time, so the schedule has no room for the possibility that it will take longer to write A and B in parallel than in series.

This makes no sense to the person who thinks software is made of bricks! Looking at the finished brick, what’s the problem? It takes x hours to make A and y hours to make B, so why would making them in parallel take longer than x + y instead of roughly max(x, y)?

Try the following: give piece A to one person, wait for it to be done, and then give piece B to another. Whoops, when the person working on B has a question about how A works, they have to track down the author and interrupt her. And if working on B teaches you something about A, is the person working on B supposed to change A? Or is the original developer supposed to backtrack and change it?

This explains a well-known nugget of wisdom: One reason adding people to a late project will cause it to slip further is that you are increasing parallelism. If the project was originally at or beyond its natural limit, further parallelism lowers productivity.

Or another example. You have 100 reported bugs to fix. You have 100 people. Do you assign one bug to each person? No way! Experience shows that bugs are rarely fully de-coupled from each other. You have to analyze the bugs as a team and try to guess their causes and relationships. If bug forty-nine is a simple text change on a page, anyone can fix it. But if bugs one, four and nine are all related, you need one contributor to address them simultaneously. Sending three people in to fix them in parallel would be a disaster.

Any time two or more pieces are strongly related either by design or by coupling in the application, it is a mistake to give each one to different people to build or fix.

In software, you want to minimize dependencies between pieces, which in turn means being very, very careful to minimize parallelism. Obviously, there must be some parallelism on any project with more than one contributor. But every project has a natural maximum amount of parallelism. Gratuitously chopping tasks into bricks to increase parallelism beyond this natural limit lowers productivity rather than increases it.

how to make the team twice as productive without parallelizing everything

What if you need two pieces, A and B, and you can’t wait for the normal amount of time to develop A and then B? Here’s an idea: instead of treating them like bricks and trying to develop A and B in parallel, why not simply hire one person who works twice as quickly? And have them develop A and B in series?

Think about this for a moment. There are a lot of claims out there that good people are three, five, ten, even twenty times as productive as the average. This seems intuitively wrong: when you look at their finished work, it rarely looks that much different from the work of the average person. So you figure the claims can’t be correct.

The finished work of the allegedly great person doesn’t look too outlandish. Ok, it has map and reduce instead of loops, and now that we look at it, the so-called great person seems to deliver fewer bricks, not more. What’s going on?

Let’s think about bricks for a moment. What if this essay is right, and many times building bricks in parallel takes more time than building bricks serially? What if it’s very hard to coördinate the interfaces and contracts between pieces that are built by different people?





Twenty years later, The Mythical Man-Month: Essays on Software Engineering is still one of the most important books ever written about developing software, from the small to the large. Read the book that spawned the expression, “there is no silver bullet.”

If most projects assign related bricks to different people, and most projects further compound this error by trying to “exploit parallelism,” you can get a big productivity win just by bucking the popular choice and asking one person to do all of the related work themselves. They’ll be as productive as a team of other people simply because they aren’t burdened with the heavy cost of parallelism and of having the wrong people working on pieces.

Of course, you need someone who is able to keep two pieces in their head at one time. That’s one of the advantages of hiring good people: they don’t necessarily need to build things that are twice as complicated: if they can keep twice as much in their head at one time, they can build related things without incurring the costs of splitting development between people.

software is transfinite

The other wrong assumption about software being like bricks is that you can measure progress on a software development project by examining physical features of the software, by counting bricks.

The underlying thought is that you imagine the finished software as a pyramid of bricks, a pile of them. You count how many bricks will be in the finished application. Now you can measure your progress by counting how many bricks are “done.”

This is very wrong, and it leads to troubled projects. The first problem with this assumption has been given above: if you need a million bricks for your application, you ought to be able to make use of absolutely anyone to move the project forward in some small capacity. As long as they move a brick an hour, they are helping. So, a brick an hour, a million bricks… let’s employ 1,000 sweating slaves for ten hours a day for one hundred days and we’ll have our pyramid. All we need are an architect and a team of overseers with sharp whips to see to it that they work without flagging.

But what happens when the millionth brick is placed and we are nowhere near completion? It turns out that software’s requirements are fluid, so fluid that you could place as many bricks as you like and still not be finished.

Measuring software productivity by lines of code is like measuring progress on an airplane by how much it weighs.
—Bill Gates


In fact, moving a lot of bricks is counterproductive: the physical manifestations of software, like written code, design specifications, unit tests, and so forth, have mass just like bricks. And if you want to redo things, the more mass you have, the harder it is to move and reshape it. In that sense, software is exactly like bricks.

Only, what you want is to move the minimum number of bricks required to test your assumptions about whether the software is complete enough. It will never be complete, but trying to measure completeness by bricks is wrong.

There are only two meaningful ways to measure progress on a software development project: the first is to ask the team to estimate how much work remains, given the most up-to-date expectation for the form of the finished application. The second is to measure customer satisfaction.

how to measure progress on software development projects with estimated work remaining

Given the most current understanding of what is to be done to complete the application, it is meaningful to ask the team to estimate how much time will be required to complete the work.

This sounds conservative, even traditional. Doesn’t every project do this when they prepare the plan? What’s the catch? The catch is, if you only do it once, you only know your progress once. This differs markedly from the traditional model, where you plan once, estimate once, and thereafter you measure progress against your plan, rather than estimating again.



To measure meaningful progress, you must re-estimate on a regular basis. If you wish to give a meaningful progress report every two weeks, you must ask the team to estimate the work remaining every two weeks. If, instead, you simply take how many bricks you thought you needed a few months ago, count how many have been moved to date, and calculate the work remaining through simple subtraction, your reports are drifting further and further away from reality every fortnight.

We know that this does not work, both from experience in the field and from critical thinking: as the project progresses, the client’s requirements change. This is especially true if the client is given the opportunity to interact with the team and engage in the learning process: Agile’s claim that requirements are subject to change is a self-fulfilling prophecy.

We also know that as we work with the software itself, we learn more about how much work is required to complete the software. For example, if you load a project’s team up with inappropriate contributors, maximize parallelism, and perhaps go three for three by minimizing testing and bug fixing, you will compound a tremendous “technical debt” over time.

Measuring “progress” against the original plan does not include the technical debt in your estimate of the work to be completed. Asking the team to estimate the amount of work to be done gives them an opportunity to factor the consequences of technical debt into their estimates. Or of any other factor that reflects what the team is learning over time.

In essence, you have an opportunity to include off-balance sheet items in your measurement of progress, whereas measuring against bricks would have excluded those factors.

how to measure progress on software development projects with customer satisfaction

Measuring customer satisfaction is easy. All you have to do is ask the customer. A successful project increases satisfaction over time. An unsuccessful project does not. I boldly posit that any project that increases customer satisfaction over time is a successful project, regardless of what was originally written in a specification.


Customer satisfaction is a key metric because software is not a pile of bricks. It is impossible to predict with certainty the set of requirements that will result in maximum customer satisfaction at the end of the project, so you must measure satisfaction as you go. That being said, there is a pitfall looming when you ask the customer to judge their own satisfaction.

Some customers have difficulty understanding the features and characteristics of software that will meet their needs in a cost-effective way. They have trouble distinguishing good software from bad, good applications from lemons.

Although such a customer may need a not-so-big application, they may demand an Enterprise Solution.

This manifests itself in the customer demanding proof of progress in the form of elaborate documents, plans, and diagrams instead of working software that solves a portion of their problem. It manifests in the customer demanding proof that you’ve “hired up” to meet their needs. Although these things have their place, none of them are working software. They are promises to develop software, and subprime promises at that.


It is easy to obtain short-term success with such a client: deliver what they ask for. This is the exact business model followed by real estate developers specializing in inexperienced, first-time buyers: offer the superficial features that provide short-term excitement at the expense of long-term satisfaction. In the case of software, you can dazzle the inexperienced customer with head counts, power points, and diagrams showing Jenga-piles of technology.

Should you do this? That is up to you: it depends on whether you wish to build short-term customer satisfaction at the expense of long-term satisfaction with the software itself. If you wish to deliver long-term satisfaction with the software, you may need to educate the customer to focus on the software itself.

And that means delivering increments of a functioning application for the client to experience. There is no simpler or surer way to increase customer satisfaction over the long term than to let them experience the application as it grows and to rate your progress by how much their satisfaction increases with the software itself.

building software without treating it like a pile of bricks

Sometimes, it really does boil down to a few simple ideas, working in concert:

  1. Hire people with a minimal competency. Do not be seduced into accepting “help” from people who are not able to contribute to the team at this level.

  2. Minimize parallelism. Exploit the talent of your best developers by giving them chunks of related work.

  3. Measure progress by continually re-estimating the work to be done and by customer satisfaction. Educate the customer to prefer completed work over documentation and promises.


 

Tuesday, July 31, 2007
  Abbreviation, Accidental Complexity, and Abstraction
Modern programming languages provide a variety of mechanisms for translating a relatively short program into a huge number of instructions for the computer’s CPU. It is tempting to think that the purpose of “high level languages” like Java, C#, Smalltalk, Ruby, or even Lisp is to be a kind of decompression algorithm: you type 147 lines of code, and the compiler elaborates each line of code, producing several megabytes of executable.

If it were as simple as that, we would say that the “highest level language” is the one that allows us to express our programs in the smallest source code, perhaps in fewer symbols or lines of code. For example, we would say that (1 x shift) !~ /^1?$|^(11+?)\1+$/ is superior to:

function (x) {
    y = Math.floor(Math.sqrt(x));
    var a = new Array();

    a.push(2);
    for (i = 3; i <= x; i+=2) {
        a.push(i)
    }
    a.reverse();
    var primes = new Array();
    while ((current_prime = a.pop()) < y) {
        primes.push(current_prime);
        for (index in a) {
            if (a[index] % current_prime == 0) {
                a.splice(index,1);
            }
        }
    }
    return a[0] == x;
}

Strictly because it is smaller by all obvious metrics (source).1






This explanation is clearly wrong. Even if both examples produced exactly the same result, the former is almost impossible to use by mortals: its form obfuscates its output. Clearly, writing the smallest possible program is not the goal.
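For readers who would like to poke at the one-liner without Perl, here is a minimal Java sketch of the same trick. The class and method names are invented for this example, and it assumes a non-negative argument: the number is written in unary (that many 1s), and the pattern matches zero, one, or any string of 1s that can be split into two or more equal groups of at least two.

```java
import java.util.Arrays;

final class UnaryPrimes {
    /**
     * Port of the regular-expression trick: write n in unary and ask whether
     * the "zero, one, or composite" pattern matches the whole string.
     * Assumes n >= 0; only practical for small n.
     */
    static boolean isPrime(int n) {
        char[] ones = new char[n];
        Arrays.fill(ones, '1');
        // String.matches anchors at both ends, like ^...$ in the original.
        boolean zeroOneOrComposite = new String(ones).matches("1?|(11+?)\\1+");
        return !zeroOneOrComposite;
    }

    public static void main(String[] args) {
        for (int n : new int[] {2, 3, 4, 11, 121}) {
            System.out.println(n + " -> " + isPrime(n));
        }
    }
}
```

It is no easier to read in Java, which is rather the point: the brevity comes from the notation, not from making the idea clearer.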

Writing smaller programs is also not an anti-goal: longer programs are not automatically “easier to read and understand.” One of the problems with longer programs is that they often are longer by virtue of containing accidental complexity, swathes of yellow code.

abbreviations

Some shorter programs are shorter merely because they contain shorter constructs. For example, if you perform some regular expression pattern matching in Ruby, you can use / characters to delimit your regular expression. That’s an abbreviation for the more tedious Regexp.new(). And there are some global variables that are automatically set for you. For example, $1, $2 and so on are set to the matching groups.

So if you write /(fu|foo)(bar|blitz)/ =~ 'my program went fubar', $1 is automatically assigned the string 'fu' and $2 is assigned the string 'bar'.

Compare and contrast this to: my_matcher = Regexp.new('(fu|foo)(bar|blitz)').match('my program went fubar'). Now you can use my_matcher[1] and my_matcher[2] to extract 'fu' and 'bar'.

Obviously, the former expression is shorter, and quite handy. And while it may look a little cryptic to someone raised on Java’s one-size-fits-all syntax of everything is a .message, it really isn’t an obfuscation. It’s an abbreviation, nothing more. It makes programs shorter without changing their meaning in any substantial way.
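For comparison, here is roughly what the long form looks like in Java, where there is no literal syntax for regular expressions and no automatic group variables. This is only a sketch: the class and method names are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LongFormMatch {
    /** Returns the two captured groups, or null when the pattern is not found. */
    static String[] fubarParts(String text) {
        Matcher m = Pattern.compile("(fu|foo)(bar|blitz)").matcher(text);
        if (!m.find()) {
            return null;
        }
        // group(1) and group(2) play the role of Ruby's $1 and $2
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] parts = fubarParts("my program went fubar");
        System.out.println(parts[0] + ", " + parts[1]); // prints "fu, bar"
    }
}
```

Nothing here means anything different from the Ruby one-liner. It is the same program with the abbreviations spelled out.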

accidental complexity

We mentioned earlier that longer programs are sometimes longer by virtue of containing accidental complexity. There’s a good point of comparison. If a shorter program is shorter by virtue of having less accidental complexity, it’s better. It has a higher ratio of signal to noise.

For example, here is one of the new for loops in Java:

for (Account account: customer.getAccountList()) {
    // do something
}

This is shorter than:

Iterator iAccount = customer.getAccountList().iterator();
while (iAccount.hasNext()) {
    final Account account = (Account) iAccount.next();
    // do something
}

It also removes some of the accidental complexity of the iterator, raising the signal by eliminating noise. To continue with the same example, let’s look at an old for loop:

for (int i = 0; i < customer.getAccountList().size(); ++i) {
    final Account account = customer.getAccountList().get(i);
    // do something
}

This has even more accidental complexity: a loop index variable. Eliminating the loop index is a decent win, because it eliminates fence post errors. But there is a bigger win in moving from an index-based loop to an iterator-based loop or a new for loop: we have abstracted away the notion that the collection must be indexed by consecutive integers.

abstractions

The iterator (and the new for loops) work with all kinds of collections, including linked lists and sets. Moving from a loop index variable to an iterator does more than just abbreviate the code, it does more than hide some accidental complexity, it provides a general-purpose abstraction for operations on collections.
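To make that concrete, here is a small sketch (the class and method are hypothetical) in which one loop, written once against Iterable, handles an array-backed list, a linked list, and a sorted set. An index-based loop could not be written against the set at all.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.TreeSet;

final class IterateAnything {
    /** Sums any Iterable: no index, no assumption about how elements are stored. */
    static int total(Iterable<Integer> numbers) {
        int sum = 0;
        for (int n : numbers) {
            sum += n;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(total(new ArrayList<Integer>(Arrays.asList(1, 2, 3)))); // prints 6
        System.out.println(total(new LinkedList<Integer>(Arrays.asList(4, 5))));   // prints 9
        System.out.println(total(new TreeSet<Integer>(Arrays.asList(3, 1, 2))));   // prints 6
    }
}
```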






So here is another point of comparison: does the shorter program provide us with a useful abstraction? Some programs are shorter through mere abbreviation, some are shorter through hiding accidental complexity, and some are shorter by providing useful abstractions.

The difference between a new for loop and an index variable for loop may seem subtle. So let’s bring out a canonical example, one we touched on earlier: regular expressions. Can anyone seriously doubt that /(fu|foo)(bar|blitz)/ provides a powerful abstraction compared to a stack of loops and indexOf method calls?

There is more than abbreviation involved, more than hiding the accidental complexity of indexOf, there is a whole new mental model involved. A regular expression is declarative, it specifies what you want to find, and leaves the how to the language implementation. It is shorter, yes. But it is also much more powerful because it provides the programmer with a huge mental lever.
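Here is a sketch of what the “stack of loops and indexOf method calls” might look like next to the declarative version. Both methods are hypothetical and written only for this comparison; each returns the position of the leftmost match, or -1 if there is none.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FubarSearch {
    /** The "how": walk every occurrence of each prefix and test each suffix by hand. */
    static int findByHand(String text) {
        String[] firsts = {"fu", "foo"};
        String[] seconds = {"bar", "blitz"};
        int best = -1;
        for (String first : firsts) {
            int at = text.indexOf(first);
            while (at >= 0) {
                for (String second : seconds) {
                    boolean follows = text.startsWith(second, at + first.length());
                    if (follows && (best < 0 || at < best)) {
                        best = at;
                    }
                }
                at = text.indexOf(first, at + 1);
            }
        }
        return best;
    }

    /** The "what": state the pattern and let the regex engine do the searching. */
    static int findByRegex(String text) {
        Matcher m = Pattern.compile("(fu|foo)(bar|blitz)").matcher(text);
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        String text = "my program went fubar";
        System.out.println(findByHand(text) == findByRegex(text)); // prints true
    }
}
```

Adding a third alternative to the hand-written version means re-reading the loops to check they still do the right thing. In the regular expression it means typing one more word between the parentheses.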

Active Record provides a very useful abbreviation that eliminates a large chunk of accidental complexity, dynamic finders. You can write User.find_all_by_street_and_city(street, cities). I won’t say what it returns, I trust it’s obvious.

You could easily write a find_all_by_street_and_city method in any language you care to name. Agreed. But if you write one yourself, you have to write one for every different kind of query you need to make. And if you write it, you trust it.

But if you are maintaining someone else’s code, do you really trust it without reading it? Or do you have a peek to see whether there’s some weird business logic in there, like some special case treating the abbreviation “Hogtown” as a substitute for “Toronto”? Repeat this process for each search abbreviation method in the code base. What if one has a bug? Or another has a specific eager loading behaviour?

If you are using an ORM like ActiveRecord, once you’ve learned how dynamic finders work, you know how they all work. Furthermore, you have an abstraction you understand, you don’t have to peek under the hood to see what’s going on. Abstractions are better than abbreviations.
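To show the shape of the idea outside of Ruby, here is a toy sketch in Java. It is emphatically not how ActiveRecord works internally: the DynamicFinder class, its call method, and the in-memory rows are all invented for this illustration, and it assumes attribute names never contain “_and_”. The point is that a single rule interprets every finder name, so there is only one piece of code to learn and trust.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class DynamicFinder {
    /**
     * Interprets a name such as "find_all_by_street_and_city" and returns the
     * rows whose named attributes equal the given values, in order.
     */
    static List<Map<String, Object>> call(String finderName,
                                          List<Map<String, Object>> rows,
                                          Object... values) {
        String prefix = "find_all_by_";
        if (!finderName.startsWith(prefix)) {
            throw new IllegalArgumentException("not a dynamic finder: " + finderName);
        }
        String[] attributes = finderName.substring(prefix.length()).split("_and_");
        if (attributes.length != values.length) {
            throw new IllegalArgumentException("expected " + attributes.length + " values");
        }
        List<Map<String, Object>> matches = new ArrayList<Map<String, Object>>();
        for (Map<String, Object> row : rows) {
            boolean allEqual = true;
            for (int i = 0; i < attributes.length; i++) {
                Object actual = row.get(attributes[i]);
                if (actual == null ? values[i] != null : !actual.equals(values[i])) {
                    allEqual = false;
                }
            }
            if (allEqual) {
                matches.add(row);
            }
        }
        return matches;
    }
}
```

With this in place, call("find_all_by_street", rows, street) and call("find_all_by_street_and_city", rows, street, city) behave the same way for the same reason, and there is no per-query method in which a “Hogtown” special case could hide.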

abstractions are not abbreviations

Abbreviations are useful. They can make code more readable by putting all of the essential workings in one visible chunk. But they aren’t as powerful as constructs that remove accidental complexity or provide abstractions.

And sometimes, abbreviations are even harmful. If the programmer reading code must understand what is being abbreviated in order to understand the code, then the abbreviation merely forces the programmer to jump around the code to figure anything out. When programs are written like this as a matter of course, the poor programmer is forced to rely on powerful IDEs that can jump to method definitions or find references quickly. She has to have these tools, because she must read all of the code to understand what it does.

The abbreviations have introduced complexity, not removed it.

Where do such programs come from, programs where the abbreviations are not useful abstractions? From those same IDEs, of course, from mindlessly refactoring to eliminate duplicate code without stopping to design the program’s mental model.

This is not a knock against powerful IDEs, far from it. But we should realize that all the same arguments raised about powerful programming languages (“operator overloading is dangerous in the hands of mediocre programmers,” “macros enable people to write unreadable programs,” and so forth) apply to tools that shuffle code around, especially when the same tools seem to make it easy to navigate the shuffled program.

When composing our own programs, when using these tools, it is not enough to merely seek to eliminate duplication. We must be mindful of the distinction between abbreviation, removing accidental complexity, and introducing useful abstractions.

It is not wrong to eliminate redundancy in code. But when we do so, we mustn’t follow the path of least resistance and mindlessly perform the refactorings suggested by our tools. This argument exactly parallels the argument about making code shorter for its own sake. Code brevity in and of itself is not desirable, well-abstracted code with a minimum of accidental complexity is desirable, and brevity follows when these goals are attained.

Likewise, elimination of redundancy is not desirable in and of itself. But it serves to warn us of the need to seek useful abstractions and to remove accidental complexity. When we work with those goals in mind, the redundancy likewise melts away, and we are able to use the tools to improve our code.2

Abbreviations might be good.
Removing Accidental Complexity is better.
And providing Useful Abstractions is best of all.



  1. And there's another difference: is 121 a prime number?
  2. Thanks, jbstjohn!


 

Monday, July 16, 2007
  How to Run Javascript on the JVM in Just Fifteen Minutes
what and why

This post is about executing Javascript inside the JVM without using a browser. Besides the fact that people are talking about running Javascript on the server (again, and again), here’s why my colleagues and I used it on a recent project:

We have some logic that needs to run on the server and on the client, depending on when the application applies it. There is an incredibly complex form validation involved. Think of a loan application, for example. Zillions of rules like “at least five years at current location or at most three locations in ten years or owns current location for at least one year.” The whole thing forms a big logical expression that needs to be evaluated in such a way that we can report which pieces are missing or do not meet requirements (Declined because income is insufficient and does not state purpose of loan).
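To give a feel for “evaluate, but also report what failed,” here is a minimal sketch of just the one example rule, written directly in Java. Everything in it is hypothetical: the Applicant fields and the method name are invented for illustration, and the real project generated its rules from a DSL rather than writing them by hand like this.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ResidencyRules {
    /** Hypothetical applicant data; the field names are invented for this sketch. */
    static final class Applicant {
        final int yearsAtCurrentLocation;
        final int locationsInTenYears;
        final int yearsOwningCurrentLocation;

        Applicant(int yearsAtCurrentLocation, int locationsInTenYears, int yearsOwningCurrentLocation) {
            this.yearsAtCurrentLocation = yearsAtCurrentLocation;
            this.locationsInTenYears = locationsInTenYears;
            this.yearsOwningCurrentLocation = yearsOwningCurrentLocation;
        }
    }

    /**
     * The rule is an OR of three alternatives. Returns an empty list when any
     * one of them holds; otherwise returns every alternative, for reporting.
     */
    static List<String> unmetAlternatives(Applicant a) {
        Map<String, Boolean> checks = new LinkedHashMap<String, Boolean>();
        checks.put("at least five years at current location", a.yearsAtCurrentLocation >= 5);
        checks.put("at most three locations in ten years", a.locationsInTenYears <= 3);
        checks.put("owns current location for at least one year", a.yearsOwningCurrentLocation >= 1);

        List<String> failed = new ArrayList<String>();
        for (Map.Entry<String, Boolean> check : checks.entrySet()) {
            if (check.getValue()) {
                return new ArrayList<String>(); // one alternative is enough: the rule passes
            }
            failed.add(check.getKey());
        }
        return failed;
    }
}
```

Multiply this by zillions of rules, nest the ANDs and ORs, and you can see why nobody wanted to maintain two hand-written copies in two languages.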

There are a couple of ways to handle this. One is to submit the form back to the server for validation. Another is to write everything in Java, but use a sophisticated tool to render the Java into Javascript. Naturally, our team chose a third option, The Rails Way (available for pre-order).

We have a Domain-Specific Language for describing the rules. Business users use the DSL, and another tool writes code from that. We could, in theory, write Java methods for the server and write Javascript for the client. We chose to start with Javascript, and we’ll write Java for the server if running Javascript on the server turns out to be too slow.

In the meantime, we decided that having some Java make a simple function call to a Javascript function and process a simple result was a reasonable first step. As a side benefit, we run all our server-side Javascript unit tests in Java test suites alongside our Java unit tests.

And after some fiddling around, we got Javascript working on the JVM. My bet is that you can get it working too, and it won’t take more than fifteen minutes.

Care to try it?

step zero: the Java Virtual Machine (JVM)

You’ll need JDK 1.5 or 1.6 from Sun. If you already have this, move on to step one. Still reading? You’ll need to do a big install before we go further.

Go to the downloads page and download the latest thing they have on offer with the words “JDK” in it. You won’t need JEE (the framework formerly known as J2EE) for this exercise, but if you know what it is you know enough to decide whether to download it.

Right now, you want JDK 6u2. Go get it and suffer through the installation process.

step one: Bean Scripting Framework

Java6 has a new framework for running “scripting” languages, and it’s built into Java6. We’re not going to use it today, just because some of you may still need to make stuff work with JDK 1.5 in production. Instead, we’re going to go get the Jakarta Bean Scripting Framework (BSF). You can download it here. We’ll need bsf.jar.
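If you are on Java 6 or later and do not need to support 1.5, the built-in framework is the javax.script API. Here is a minimal sketch of the same kind of hello-world evaluation we are about to do with BSF. The class and method names are made up for this example. Whether a Javascript engine is actually registered depends on the JVM you run it on, so the sketch returns null rather than assuming one exists.

```java
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

final class BuiltInScripting {
    /**
     * Evaluates a Javascript expression with the javax.script API.
     * Returns null when the running JVM has no Javascript engine registered.
     */
    static Object evalJavascript(String expression) {
        ScriptEngine engine = new ScriptEngineManager().getEngineByName("javascript");
        if (engine == null) {
            return null; // assumption: callers treat "no engine" as "no result"
        }
        try {
            return engine.eval(expression);
        } catch (ScriptException e) {
            // wrapped so callers do not have to declare a checked exception
            throw new IllegalStateException("script failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(evalJavascript("'hello, ' + 'Javascript'"));
    }
}
```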

step two: fix gotchas

YMMV, but I found that I couldn’t get BSF working without including the Jakarta Commons-Logging jar. So if you don’t have this floating around, go here and download it. I experimented, and I could ignore everything except commons-logging-1.1.jar. If that was missing, BSF kakked.

step three: Rhino

Since we’re going to run Javascript, we need an interpreter. Rhino to the rescue. Download it. We’ll need js.jar.

step four: keeping things organized

Ready to code? Let’s start with a directory for all of our stuff. Call it hello_javascript. For the sake of keeping things simple, set up the sub-structure as follows:

hello_javascript
hello_javascript\lib

You may be using a fancy IDE, you may be using a text editor and have to graft your classpaths together with chicken wire. The important thing is that your classpath, besides including all of Java’s required stuff, and your own Java classes, also includes bsf.jar, commons-logging-1.1.jar and js.jar.

We’ll put all three in the lib subdirectory:

hello_javascript\lib\bsf.jar
hello_javascript\lib\commons-logging-1.1.jar
hello_javascript\lib\js.jar

step five: “Hello, Javascript”

Let’s write some Java: create the following subdirectories and put a file called HelloJavascript.java in the innermost one:

hello_javascript\com\raganwald\hello\HelloJavascript.java

Let’s give it some code:

package com.raganwald.hello;

import org.apache.bsf.BSFException;
import org.apache.bsf.BSFManager;

public class HelloJavascript {
    public static void main (final String[] argv) throws BSFException {
        final BSFManager manager = new BSFManager();
        // evaluate a Javascript expression and get its value back as a Java object
        final Object jso = manager.eval("javascript", "(java)", 1, 1, "'hello, Javascript'");
        System.out.println(jso.toString());
    }
}


Run your new Java application. Did you see that? It interpreted some Javascript in the JVM without a browser. Check your watch. Did you need more than a quarter of an hour? I didn’t think so.

You can try more ambitious code:

manager.eval(
    "javascript", "(java)", 1, 1,
    "var f = function (what) { return 'hello, ' + what; }; f('Javascript');");


including other files is an exercise left for the reader

I didn’t find an easy way to get Javascript files to include other Javascript files. This isn’t the worst thing in the world, but you certainly don’t want to write anything substantial inside of Java strings. So try experimenting with reading javascript files right off the classpath.

I created a subdirectory called javascript:

hello_javascript\javascript

And you can read Javascript into your Strings or StringBuffers with some fairly simple code, thanks to a utility built into BSF:

import java.io.FileReader;

import org.apache.bsf.util.IOUtils;

// ...

static String readScript(final String fileName) throws Exception {
    final FileReader in = new FileReader(fileName);
    try {
        // IOUtils reads everything the reader has to offer into one String
        return IOUtils.getStringFromReader(in);
    }
    finally {
        in.close();
    }
}


That reads some script into a string. You can then prepend it to whatever you want to evaluate. Note that if you want to set up some sort of simple checking to make sure that you don’t “include” the same file twice, you will need to write yourself a little framework for that, perhaps using a Set to keep track of what you’ve already loaded.
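The “little framework” can be very little indeed. Here is one possible sketch using a Set, with nothing but the standard library so it stands alone. ScriptIncluder and its methods are hypothetical names, and it treats two paths as the same file when their canonical paths are equal.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.LinkedHashSet;
import java.util.Set;

final class ScriptIncluder {
    // Canonical paths of every file already included, so each is loaded once.
    private final Set<String> loaded = new LinkedHashSet<String>();
    private final StringBuilder source = new StringBuilder();

    /** Appends the file's contents unless it was included before; returns true if appended. */
    boolean include(String fileName) throws IOException {
        String key = new File(fileName).getCanonicalPath();
        if (loaded.contains(key)) {
            return false;
        }
        BufferedReader in = new BufferedReader(new FileReader(key));
        try {
            String line;
            while ((line = in.readLine()) != null) {
                source.append(line).append('\n');
            }
        } finally {
            in.close();
        }
        loaded.add(key); // only remember files that were read successfully
        return true;
    }

    /** Everything included so far, followed by the expression to evaluate. */
    String prependTo(String expression) {
        return source + expression;
    }
}
```

You would call include for each library file and then hand prependTo("myFunction();") to manager.eval. It does not let one Javascript file pull in another, but it does keep the Java side from loading the same library twice.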

garbage in, garbage out





This is nice, and with a little work you could make a program that reads paths to Javascript files off the command line and executes them. But to make things really interesting, you want to find a way to get Java data into your Javascript and do something useful with the results, not just print it as a String.

BSF provides a way to inject objects into the scripting language’s environment, so you could use that facility. When writing automated unit tests for that particular project, I chose a simpler route: I serialized the data into JSON and used that to call a Javascript function directly via BSF:

manager.eval("javascript", "(java)", 1, 1,
    "myJavascriptFunction(" + myJSONString + ");");


This is a really bad idea if your JSON is handed to you from an insecure source, such as a public web page calling you back via XMLHttpRequest, but if you trust your source, this works wonderfully.
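If you cannot fully trust the source, one option is to stop splicing the payload in as code and pass it as a quoted string instead, leaving the Javascript function to parse it with whatever JSON parser you trust. Here is a sketch of the quoting half only. JsLiteral is a hypothetical helper, not part of BSF or Rhino, and it is a starting point rather than a security guarantee.

```java
final class JsLiteral {
    /** Wraps text in double quotes, escaping anything that could end the literal early. */
    static String quote(String text) {
        StringBuilder out = new StringBuilder("\"");
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n");  break;
                case '\r': out.append("\\r");  break;
                default:
                    // control characters and the Unicode line/paragraph separators
                    if (c < 0x20 || c == 0x2028 || c == 0x2029) {
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.append('"').toString();
    }
}
```

The call then becomes "myJavascriptFunction(" + JsLiteral.quote(myJSONString) + ");", and a payload like 1);evil( arrives as the characters of a string rather than as something the interpreter runs.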

Now what do you do with the result? If you are generating something esoteric like a Javascript function, I have no idea. In my own case, I return all values as simple trees of Hashes (Javascript objects without any special methods) and Arrays. I convert those into Java trees of Maps and Lists:

import org.mozilla.javascript.NativeArray;
import org.mozilla.javascript.ScriptableObject;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// ...

static List unwrapNativeArray (final NativeArray na) {
    return new ArrayList<Object> () {{
        for (int i = 0; i < na.getLength(); ++i) {
            add(unwrapNative(na.get(i, null)));
        }
    }};
}

static List unwrapPrototypeArray (final ScriptableObject sObj) {
    return new ArrayList<Object> () {{
        final List<Object> sObjIds = Arrays.asList(sObj.getAllIds());
        for (int i = 0; sObjIds.contains(i); ++i) {
            add(unwrapNative(sObj.get(i, null)));
        }
    }};
}

static Map unwrapObject (final ScriptableObject sObj) {
    return new HashMap<String, Object> () {{
        for (Object id: sObj.getAllIds()) {
            put(id.toString(), unwrapNative(sObj.get(id.toString(), null)));
        }
    }};
}

static Object unwrapNative (final Object obj) {
    if (obj instanceof NativeArray) {
        return unwrapNativeArray((NativeArray) obj);
    }
    else if (obj instanceof ScriptableObject) {
        final ScriptableObject sObj = (ScriptableObject) obj;
        final List<Object> sObjIds = Arrays.asList(sObj.getAllIds());
        if (sObjIds.contains("keys")) { // a prototype enumerable/hash
            return unwrapObject(sObj);
        }
        else if (sObjIds.contains("flatten")) { // a prototype enumerable/array
            return unwrapPrototypeArray(sObj);
        }
        else return unwrapObject(sObj);
    }
    else return obj;
}

Check your watch. Are you still under fifteen minutes? Great!


 

Wednesday, July 04, 2007
  Certification? Bring It On!
Not too far in the distant past, I was persuaded to give my résumé to a recruiter. He was trying to place a development manager for a growing company, and they wanted heavy Agile experience, deep management experience, and someone with some technical chops. Well, I figured that twenty-plus years of experience, with something like eight years of legitimate Agile, leading a team of twenty-plus, producing a product that won several Jolt awards and a JDJ Editor’s Choice award… I thought I was a lock for an interview.

But an email came right back: The client is wondering where you got your degree. Twenty years of experience, and they want to know how I spent my time in the mid-eighties.

This got me thinking about certification. It’s another long-running debate. And something funny has happened to me. I’ve switched sides. I’m actually in favour of certifying software developers. Yes, I am in favour of disqualifying intelligent programmers from professional employment if they do not possess a little piece of paper from a certifying agency.

Deep breath. Wait for the room to stop spinning. Or is my head spinning around on my shoulders (heh)?

the catch

Like everyone else in favour of certification, I have my own ideas about what skills and knowledge you need to demonstrate to get your certification. Unlike everyone else, I think I would fail my own certification if I didn’t do a whole lot of studying. That’s because I think our industry is undemanding, very undemanding. I know a few people who would pass my certification without studying. But only a few.

Before I tell you what’s on the final examination, let’s talk about what isn’t:


By now, you are thinking, “Raganwald, this certification is worthless. You are excluding just about everything we know about writing great software. What’s the point?”

Let me explain. My certification does not say you are any good at coding software. Let me repeat. My certification does not say that you are any good at coding software. I’ll let the marketplace decide. I am not telling businesses, “Hire certified programmers, they are great coders.”

Why should I? Business is perfectly happy to hire programmers with Comp. Sci. degrees, and there seems to be little or no evidence that a Comp. Sci. degree says anything about your ability to deliver working software. So why should my certification raise that bar?

what’s on the exam

Now let’s talk about what is on the exam. Just one subject, but the exam goes into excruciating detail about that subject. If you don’t know this subject cold, I am not going to certify you. Period. No debate, no negotiation, no “equivalent experience.”

The one subject? Testing and Quality Control. That’s right. All I care about is that if you are asked to make bulletproof software, you know how. I’m going to ask you about:


And more:


And a whole lot more on top of that. There is room for debate about whether to have separate testers or whether programmers should test themselves. You are not getting certified with me unless you know how to do it both ways, and can write a comprehensible essay describing the relative advantages and disadvantages of both. I am not going to require you to know the latest programming frameworks, but you are not getting certified unless you demonstrate up-to-date knowledge of the latest continuous integration tools.





I don’t care if you know how to write a great architecture document. But I will fail you if you can’t write a good code review. The same goes for everything else. I will not demand that you do it a specific way, but you will prove that you have state-of-the-art knowledge of how to ensure that software is solid and does what you expect it to do, no more, no less.

And you know what? After you get your piece of paper, you’ll need to work for at least a year under the supervision of a certified leader to get your upgraded “practitioner” certification. I want you to practise continuous education.

Does this mean that I am going to certify all of the QA Analysts in the industry while barring the programmers from work? Well, let me ask you: do most, a few, or any of the QA Analysts you know have a deep knowledge of software quality and methodologies? Can they write an essay describing the cost of bug fixing comparing early vs. late detection? Can they talk about various unit testing tools? Can they measure code coverage? Can they look at a piece of code with 93% coverage and tell whether the missing 7% has one or more crucial cases missing?

If so, I want to certify them. If not, they need to hit the books with me.
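To make the coverage question concrete, here is a minimal sketch. Everything in it is hypothetical: the Account class, its messages, and the tiny hand-rolled "suite" are invented for illustration. The checks execute nearly every line, yet the one branch they never reach is the one that matters most.

```ruby
# Hypothetical example: none of these names come from the essay.
class Account
  attr_reader :balance

  def initialize(balance)
    @balance = balance
  end

  def withdraw(amount)
    raise ArgumentError, "amount must be positive" unless amount > 0
    if amount > @balance
      # The crucial case: never exercised by the checks below.
      raise ArgumentError, "insufficient funds"
    end
    @balance -= amount
  end
end

# A "suite" that executes every line except the insufficient-funds branch.
account = Account.new(100)
account.withdraw(30)
raise "wrong balance" unless account.balance == 70

begin
  account.withdraw(0)
  raise "expected rejection"
rescue ArgumentError
  # Non-positive amounts are rejected, as intended.
end
```

A coverage report for this would look healthy. Only a reader who asks what the unexecuted lines do would notice that overdrafts are untested.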

where did I get this crazy idea?

In our industry we have wasted millions of person-years debating the relationship between software development and architecture/engineering. Last weekend I went to see Pixar’s Ratatouille with my son. And it struck me: we should be like chefs.

Do I demand that the chef in a restaurant use a certain kind of stove? Cook a certain kind of food? Manage her kitchen a certain way? Non! The marketplace decides all these things. And the marketplace works for this. What do we demand? What do we require of restaurants?

Safety. We demand that if they serve the public, that they have certain fire safety standards. That they have certain food cleanliness standards. That they know enough about food not to poison us by accident. Of course, a cook learns to cook when getting their designation. But the thing we really demand of them is that they keep us safe!

If we order Chicken, we do not want the Fish to come out and put us into Anaphylactic Shock. Cooks know that mixing up Chicken and Fish is fatal, while mixing up Basil and Oregano is not. So they have a different protocol for handling the two kinds of food. They keep us safe.

And if I am placed in charge of certification one day, that’s what I will demand. Keep us safe. Don’t leave back doors and XSS vulnerabilities in your code. Don’t store our passwords in the database. Don’t deliver code that is full of undiscovered bugs.
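One common reading of "don't store our passwords" is to keep only a random salt and a deliberately slow hash derived from the password. The following is a minimal sketch under that assumption, using Ruby's bundled openssl and securerandom libraries. The method names, the iteration count, and the key length are illustrative choices, not recommendations from this essay.

```ruby
require "openssl"
require "securerandom"

# Hypothetical parameters chosen for illustration only.
ITERATIONS = 100_000
KEY_LENGTH = 32 # bytes

# Derives a fixed-length digest from the password and salt with PBKDF2.
def derive(password, salt)
  OpenSSL::PKCS5.pbkdf2_hmac(
    password, salt, ITERATIONS, KEY_LENGTH, OpenSSL::Digest.new("SHA256")
  )
end

# Returns what would be stored in place of the password itself.
def password_record(password)
  salt = SecureRandom.random_bytes(16)
  { salt: salt, digest: derive(password, salt) }
end

# Compares every byte so that timing does not reveal where a mismatch occurs.
def password_matches?(record, attempt)
  expected = record[:digest].bytes
  actual = derive(attempt, record[:salt]).bytes
  return false unless expected.length == actual.length
  expected.zip(actual).inject(0) { |diff, (a, b)| diff | (a ^ b) }.zero?
end
```

A login check then calls password_matches? with the stored record and the submitted password, and the database never holds the password itself.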

If someone can be relied upon to write software that is safe, I will not dictate how they do it, or how long it takes to do it. The marketplace can decide whether they are employable, much as a restaurant can decide whether to hire someone whose food is bland and unappetizing.

I am not telling businesses that they can’t ship software full of bugs. You have Product Managers, they can decide whether it is more important to build new features or fix the old ones. Microsoft, do your thing! But business can’t make those decisions unless they have an accurate view of what the software actually does, of which parts are solid and which are brittle.

And I am not telling managers how to run projects. But I do expect that a certified manager understands the trade-offs when she chooses BDUF over Agile, Theory D over Theory P. She can do as she pleases, provided her eyes are wide open to the consequences.

Well, that’s my thought about certification. I’m all for it, don’t let anyone say otherwise. And like everyone else, I want it to reflect what my experience tells me is important about software development in the commercial environment, namely safety and a clear view of what works and what doesn’t.



This is obviously a pipe dream, the product of Dark Horse Café’s “Ruby’s Pride” French Press coffee. But what do you think? Would such a certification be useful?


 

Friday, June 29, 2007
  Which theory fits the evidence?
There are two schools of thought about the practice of managing software development (the theory of managing software development is of little use to us because “the gap between theory and practice is narrower in theory than it is in practice”).

One school is that everything is fully deterministic in practice (“Theory D”). If development appears, from the outside, to be probabilistic, it is only because we haven’t discovered the “hidden variables” that fully determine the outcome of software development projects. And, since we are talking about development in practice, it is practical to measure the variables that determine the outcome such that we can predict that outcome in advance.

The other school of thought is that development is fully probabilistic in practice (“Theory P”), that there are no hidden variables that could be used to predict with certainty the outcome of a software development project. Theory P states that the time and effort required to measure all of the variables influencing a software development project precisely enough to predict the outcome with certainty and in advance exceeds the time and effort required to develop the software.

Theory P does not mean that software development cannot be managed in such a way that the desired outcome is nearly certain: the flight of an airplane is fully probabilistic as it encounters atmospheric conditions, yet we have a huge industry built around the idea that airplanes arrive at their destinations and land on the runway as planned.

why do theory p and theory d matter?

Understanding whether software development follows the Theory D (fully deterministic) model or the Theory P (probabilistic) model helps us set our expectation for the relationship between what we plan and what transpires.

If we believe Theory D, we believe that it is possible and practical to plan software development entirely in advance. Therefore, when things do not go as planned, our first reaction is to either blame the planners for faulty planning or to blame the implementers for failing to carry out a reasonable plan. Believing in Theory D, we believe that we ought to have a plan that can be carried out to perfection.

Programming is not complicated because computers are complicated—it’s complicated because your requirements are complicated (even if you don’t know it yet).
Chris Ashton

If we believe Theory P, we believe that it is only possible and practical to plan some part of software development in advance. Therefore, when things do not go as planned, our first reaction is to embrace the new information and update our expectations. Believing in Theory P, we believe we ought to have a process for continually updating a plan that asymptotically approaches a description of reality as the project nears its conclusion.

belief drives behaviour

Our belief about which theory is true drives the way we manage software development projects in almost every way. Here are three examples: the way we manage software design, the way we manage time estimates, and the way we manage selecting people.

Theory D adherents believe you can design software in advance. They believe it is possible to collect all of the information needed about software’s requirements and the technical elements of its construction, such that you can fully specify how to build it before you start. In short, Theory D adherents believe in Big Design Up Front.

Theory P adherents believe that software can only partially be designed in advance. They believe that requirements suffer from observation, that the act of building software causes the requirements to change. Theory P adherents also believe that technical factors cannot be perfectly understood, that only the act of trying to build something with specific components will reveal all of the gotchas and who-knews associated with a chosen technology strategy. They believe that software design is an iterative process, starting with a best guess that is continually refined with experience.

Theory D adherents believe it is possible to estimate the amount of time required to develop software (in both the large and the small) with precision. This is partly a consequence of their belief that you can know the requirements and design in advance, and therefore you can plan the activities required without uncertainty. Theory D adherents do not plan to miss milestones. Theory D adherents do not, in fact, have a process around re-estimating tasks; instead, they have a mechanism for raising exceptions when something goes wrong.

Theory D adherents believe that the normal case for software projects is that tasks are completed within the time estimated. (If extra time is required, people on Theory D projects work nights or weekends, or they cut testing time. They do this because their belief is that if a task takes too long, the fault lies with the estimate or with the worker carrying out the task, and by working overtime they can “make up for their fault.” Theory D managers often “game” their workers by “negotiating” estimates downward in a cruel game of “guess the estimate I’m thinking of.”)




Critical Chain is an amazing book. The narrative form—a novella detailing a technical project team and their search for a way to manage an uncertain process—is a big win, it highlights the important ways that Critical Chain Project Management handles risks and uncertainty and makes it visible where everyone can manage it.

The section on estimating tasks alone is priceless. If you can’t afford a copy and your library doesn’t stock it, borrow mine. You must read this book if you participate in software development teams.

Theory P adherents believe that there are lies, damned lies, and software development estimates. This is partly a consequence of their lack of faith that the requirements are truly fixed and that the technology is fully understood. If you don’t know what you’re doing and how you’ll do it with precision, how can you know when it will be done? Theory P adherents build processes around re-estimating estimates, such as burndown charts and time-boxed iterations.

Theory P adherents are always fussing with an updated view of how long things will take. They talk about “velocity” or “effective vs. actual programmer-hours.” Theory P adherents believe that the normal case for software projects is that tasks are rarely completed exactly as estimated, but that as a project progresses, the aggregate variance from estimates falls.

Theory D adherents believe that the most important element of successful software development is planning. If a plan is properly constructed for the design and development of a software project, the actual implementation is virtually guaranteed. Theory D adherents invest most of their human capital in “architects” and “managers,” leaving little for “programmers.” They often have architects, senior developers, and other “valuable resources” involved in the early stages of projects and then moved to the early stage of other projects, leaving the team to implement their “vision.” They likewise believe that you can “parachute” rescuers into a troubled project. Since the plan is perfect, it is easy to jump in and be productive.

Theory D adherents believe in “architecture by proxy,” the belief that using frameworks, components, programming languages, libraries, or other golden bullets makes it possible to employ lesser talents to perform the implementation of software, since the difficult decisions have been made by the creators of the pre-packaged software. Theory D adherents also believe in “success by proxy,” the belief that using methodologies, practices, SDLCs, or other buzzwords makes it possible to employ lesser talents to perform the management of software development, since the difficult project management decisions have been made by the “thought leaders” who coined the buzzwords.

Theory P adherents believe that the most important element of successful software development is learning. They invest their human capital more evenly between implementers and architects, often blurring the lines to create a flatter technical structure and a more egalitarian decision-making environment. This is a consequence of the belief that learning is important: if you invest heavily in a few “smart” people, you have a very small learning surface exposed: there is only so much even very bright people can learn at one time. Whereas when the entire team meets a certain standard for competence, there is a very large learning surface exposed and the team is able to absorb more information.

They strongly prefer to have the same team work a single project from start to finish, believing that when a member moves on to another project, crucial knowledge moves on with them. They likewise abhor bringing new members onto a team late in a project, believing that the new people will need experience with the project to “get up to speed.”

Theory P adherents use frameworks (especially testing frameworks), but are skeptical of claims that the framework eliminates technical risk or the need for talented contributors. Theory P adherents, even Agilists, are skeptical of methodology claims as well. They do not believe that a deck of slides and a nicely bound book can capture the work required to learn how to develop software for a particular user community in a particular environment.

Theory D and Theory P adherents are easy to distinguish by their behaviour.

so which theory fits the evidence?

Which theory fits the evidence collected in sixty years of software development?

To date, Theory P is the clear winner on the evidence, and it’s not even close. Like any reasonable theory, it explains what we have observed to date and makes predictions that are tested empirically every day.

Theory D, on the other hand, is the overwhelming winner in the marketplace, and again it’s not even close. The vast majority of software development projects are managed according to Theory D, with large, heavyweight investments in design and planning in advance, very little tolerance for deviation from the plan, and a belief that good planning can make up for poor execution by contributors.

Does Theory D reflect reality? From the perspective of effective software development, I do not believe so. However, from the perspective of organizational culture, theory D is reality, and you ignore it at your peril.

Do not confuse Computer Science—the study of the properties of computing machines—with Software Development, the employment of humans to build computing machines. The relationship between Computer Science and Software Development parallels the relationship between Engineering, the hard science of the behaviour of constructions, and Project Management, the employment of humans to construct engineered artefacts.


(A portion of this essay originally appeared in What I’ve Learned from Sales, Part II: Wanna Bet?)

Update: D is for “D’oh! We should have gone with P!” and The Myth of Project Management, a SFW retelling of Project Management is Bollocks!.


 

Friday, June 08, 2007
  Still failing, still learning
The good news: I’m still learning.
The bad news: …from failure.

This post officially announces that my side project (originally named cause & effect and later named certitude) is over.

What I wanted to achieve

For those of you who weren’t subjected to one of my enthusiastic rants, here was my Graham Question: Can we predict the outcome of a software development project with objective observation?




Although most businesspeople would soil their khakis if they had to ride a tank into action, Reis and Trout are right on when they compare the four major strategies of War—Offense, Defense, Flanking, and Guerrilla—to businesses, especially start ups. I rate this even higher than Crossing the Chasm for its ability to succinctly explain what growing companies have to do and how to do it to succeed.

After you’ve read what I have to say here about my failure, I invite you to read what Marketing Warfare has to say about Guerrilla Warfare (just three tactics to pursue!) and see if I'm a case in point.

I have always believed that the answer is Yes. And I manage projects that way. But I also strongly believe that the only way to prove that we have an objective understanding of an algorithm is to mechanize it, to write a piece of software that executes your algorithm.

So I set out to write a piece of software that, pure and simple, would look at a software development project and show you a traffic light: a green light would mean that the project looks like it’s on track, a yellow light would mean that the project needed help, and a red light would mean that there is no hope.

I won’t go into a lot more detail, mostly because this isn’t a story about teeming hordes of customers and me not being able to finish by deadline. It’s a story of inventing a great solution to a problem nobody cares about. But if you absolutely must visualize the application, think of something that gobbles up your MS project files, your bug tracking data, even your burn down spreadsheets, and belches out that traffic light when it’s done. That’s it.

So how and why did I fail?

Project management software is social software

Reason zero, just as Paul Graham warned, my age. Really. I remember how much code I could crank at age 22, and now that I am double the age, I write an order of magnitude less. I’d like to say that my code is that much better that it makes up for the lack of volume. That might be true if I start a project with a specific end in mind, but the act of experimenting, tinkering, and exploring benefits from being able to turn large amounts of code around in short amounts of time.

This is reason zero because I wanted to get it out of the way before explaining why I think I still would have failed if I were 22. But it’s still important to understand, because I might have known that I was going to fail two years ago, instead of today.

Reason one that I failed—and this is the most important reason I failed—is that I was trying to tackle a social problem with technology. This can work—ask any dating site zillionaire—but it is a very high-risk strategy for software. Not just for making money, but for what really matters, adoption.

Project management is a social problem. It is 99.5% about getting everyone who knows something about the state of the project to share what they know with everyone else. Getting all the relevant information is 99.5% of the problem, analyzing the information is 0.5% of the problem.

Joel likes to brag about how much trouble FogBugz goes to to make it easy to enter bugs, about how certain kinds of reports are not available to discourage punitive social behaviours like punishing developers who generate too many bugs. This is a strong hint that getting good information is all about managing people’s perception of the likelihood of punishment for telling the truth.

Sitting here typing this, I think the company who can do the best job of predicting the outcome of software development projects is Inkling Markets. That’s because their entire business is about finding a way for people to communicate what they really think of something, not just what they think other people want them to say about something. I think Todd Proebsting would agree.

This problem isn’t limited to dysfunctional environments where people cower in fear of saying anything except “Sir, Yes Sir” when told to change the laws of physics.

Lemons, damned lemons, it’s always lemons

Project management suffers from a real lemon problem. I quoted Bruce Schneier at length about this already, so this time I’ll explain things directly:

Most managers, especially those with limited experience shipping software on a predictable schedule, do not know how to correlate what they’re told about the project with the likelihood that the project will succeed.

Some information is valuable, some is junk. The problem is, managers “buy” information. They trade favours like letting you keep your job for information about how well you are doing your job. So there is a market for information just like a market for cars.

They also “sell” information, literally: they have to make a report or a presentation to their superiors, or to stakeholders, or to their fellow founders at the YCombinator dinner.

When a manager cannot tell the difference between information that is useful for predicting the outcome of a project and information that is not useful for predicting the outcome of a project, she thinks about the next best thing: the “resale value” of the information with people one step removed from the project, like her own manager. So she values things like pretty PowerPoints about the architecture higher than finished pieces of functionality.

(This is why I have always sweated my heart out to give good presentations. My teams have depended on me being able to take good information and sell it upstream just as if it were CMM Level Five Buzzword-Compliant Junk).

Do managers further removed from a project always value pretty junk more than good, solid information? Not always, but often. And that’s enough for people to be pressured to give the bad information that sells to their manager, while hiding the good information that doesn’t sell. Exactly like the owners of good cars taking their treasures off the market.

Lemon and bay leaf crème brûlée

Why does junk information outsell good information? Nice PowerPoint isn’t a good explanation by itself: there are nice PowerPoints explaining Agile, but managers still prefer waterfall.

Consider a not so big design. Let’s call completing that design good information: we’ve done a good job finding out what’s really important for the project and making a design that emphasizes the way this project is unique, not the technology stack.

Now consider a typical technology design, emphasizing frameworks and technologies. Fully buzzword-compliant.

Which one sells better? The technology stack does. Why? Well, for starters, managers have been exposed to seventeen billion dollars’ worth of advertising talking about the benefits of technology stacks. Nobody is advertising the specific ways the not so big design helps the project. How could they? Those are specific to the project, that’s the whole point.

And managers are like anyone else, they compare what you are doing to successful projects they have seen in the past. Once again, the not so big design doesn’t have anything in common with other projects, but the technology stack does. (There are lots of failed projects with technology stacks, of course. But who cites those when bugging the team about whether they will use Hibernate as their ORM?)

How did this happen? How did things that have no correlation to the success of a project become more attractive than things that do?




Mmmmmm... Elegantly Easy Crème Brûlée... The title doesn’t lie, the recipes are easy: even I was able to make tasty desserts! For motivating a team or thanking your family for putting up with your devotion to your start up, dessert is always a good choice :-)

Quite simply, people have an incentive to look successful. So they imitate the outward appearances of successful projects. We have a really simple way of completing successful software projects: we put successful people on them. But we have a broken way of thinking about it: we don’t like to think of the people as being special, we think that what the people do is successful.

And by that logic, we can take anyone, have them do the same things as successful people, and our projects will succeed.

In a manager’s mind, the measure of whether information is good or not is, Does it measure whether people are doing the same things that successful people have done on projects I’ve been told were successful?

This is not the same thing as measuring whether the project is on its way to success at all. This measures the outward appearance of a project. Things that can be measured easily are rarely the most significant things. Behaviours that can be “gamed,” like how many hours a team is working, will be gamed.

And as above, even if a manager knows better, does her manager know better? If not, good information will be difficult to sell and she will be under a lot of pressure to serve Lemon Pie.

Off balance sheet transactions

There’s another important reason that projects have bad information: the best information is off balance sheet. That’s an expression meaning something a businessperson sweeps under the rug. Try Googling Off Balance Sheet Transaction. It’s never pretty.

In essence, project plans and reports never include the most important information about the likelihood of project success. Never. (I mean never in the same sense that Joel Spolsky means “nobody,” as in “fewer than 10,000,000 project plans”).

Let me give you a recent example. I was discussing a project with a certain manager in a client organization, and there was a rather linear series of releases. Her question was, Can’t we shorten the plan by looking at the dependencies and starting some releases in parallel with others?

Of course we could, but in doing so, we increased the project risk. Just as a single developer has a problem keeping multiple tasks in one brain, a team has the same problem: when working on several unrelated pieces of software at the same time, the individual developers may only work on one thing at a time, but the managers and the testers and especially the stakeholders who are thinking about functionality and exercising change control are overloaded, so they will make poorer decisions.

And worse, although you might think that there are no dependencies, you only think that at the outset of a project because of assumptions you are making now, before you fully understand what you are getting into. All in all, there’s a reason Agile projects have a practice of working on iterations with single themes, and the reason is to reduce risk.

Does this mean that manager wasn’t making a reasonable trade off between risk and time? Maybe she was making a very reasonable choice. But let me ask you this: where on the plan would you see the risk component of the decision? If you were comparing Plan “A,” with the linear releases that finish in August, and Plan “B” with some parallel releases finishing in July, how do you decide which plan is better?

Right, you can’t see anything except the difference in dates, so you choose Plan B. The risk component of the plan is off balance sheet. Enjoy your lemon.

(And it is very hard for a manager’s people to explain, for the twentieth time, that it is a mistake to try to optimize a project by having everybody working at once, because that crystallizes certain architecture decisions too early, and haven’t you read any Eliyahu Goldratt, whose novel The Goal explains what’s wrong with optimizing resource engagement when you should be managing project risk? The incentive is for them to keep their own counsel and put their résumés up on the web. Such is life.)

A more striking example comes from another project where, for various reasons, there was nearly 100% turnover of the senior staff, and finally an outside firm was brought in to clean it up. Do you think there was a notation on the plan for staff turnover? That has to have a huge risk implication, but the plan where you use the same team from start to finish looks exactly like the plan where new people are brought in for each phase or iteration.

Why is risk off balance sheet? I think it’s because people have a blind spot for risk in projects. After the fact of a project, you can always do a postmortem and say, “if we had done this, and this, and if we didn’t do that, we would have succeeded.” So you blame yourself for poor execution.

There’s a culture of Boolean thinking about projects and plans. They worked or they didn’t. We’ll be done in July or August. Not “With Plan A, we’re 90% August, 8% September” and “With Plan B, we’re 12% July, 56% August, but 23% September and 7% October!”
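To see what the Boolean view hides, here is a small sketch that takes the percentages quoted above at face value. The constant and method names are hypothetical. Note that each plan's figures add up to 98%, and the sketch leaves the remaining 2% unassigned rather than guessing where it belongs.

```ruby
# Completion-month chances quoted in the text, as integer percentages.
# Each plan sums to 98; the unassigned 2% is deliberately left out.
PLAN_A = { august: 90, september: 8 }.freeze
PLAN_B = { july: 12, august: 56, september: 23, october: 7 }.freeze

MONTH_ORDER = [:july, :august, :september, :october].freeze

# Percentage chance of finishing in or before the given month.
def chance_done_by(plan, deadline)
  cutoff = MONTH_ORDER.index(deadline)
  plan.select { |month, _| MONTH_ORDER.index(month) <= cutoff }
      .values
      .inject(0) { |sum, pct| sum + pct }
end

# Earliest month with any chance of finishing: the only figure a
# Boolean plan puts in front of the decision maker.
def headline_date(plan)
  MONTH_ORDER.find { |month| plan.key?(month) }
end

[["A", PLAN_A], ["B", PLAN_B]].each do |name, plan|
  puts "Plan #{name}: headline #{headline_date(plan)}, " \
       "#{chance_done_by(plan, :august)}% done by end of August"
end
```

On these numbers Plan B advertises July but is only 68% likely to be done by the end of August, against 90% for Plan A. The headline date favours B. The distribution does not.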

Decisions that add risk to a project, such as excessive parallelization, the use of unproved people, the use of teams who have never worked together, or forcing decisions prematurely: none of these things are reflected in the plan.

(This is another reason there is pressure downwards on developer competence in many organizations. Consider two managers: the first staffs up quickly by selecting available developers, who may not be particularly good. In fact, they are less good than the existing team average. The second moves more cautiously, only adding to the team when the addition improves the average competency of the team. Do you think the second manager will keep their job long enough to finish a project? No, because the ticking time bomb of hiring certain types of developers is off balance sheet, but being understaffed is out there for everyone to criticize.)

I had this hope that through a kind of collaborative filtering I could have a piece of software notice that linear plans or plans involving hiring too quickly or whatever had less variance than parallel plans, and adjust the traffic light accordingly. Regardless of what people would say, it would shrug its shoulders and remorselessly remark on the actual historical averages.

Nice idea? No, it’s a dud. (Or at least, my execution was a dud!)

Somebody set up us the bomb

The second major reason I bombed is that I couldn’t find any buyers. Quite simply, the people who understand these principles are running successful projects. I know lots of people who agree that there is a problem and agree this would help, but they don’t think they need help.

I couldn’t find anyone holding their head in their hands, saying, “If only I could get the truth about our projects, I could open my budget and shower you with gold.” The people who were aware of the problems with project information were busy using decidedly low tech tools—like hiring good people and having daily meetings and estimating tasks in spreadsheets—to solve their problems.

And the people who could use some help quantifying the consequences of their choices—like managers who insist on calcifying decisions very early in a telephone book design document so that they can demonstrate progress in the form of an architecture—they don’t think there’s anything wrong with their approach.

Why is that? My conjecture is that people want help with things that fit their model of what’s important. Someone who uses skilled practitioners and XP thinks that limiting risk is important: that’s why they use two week iterations and merciless refactoring rather than up-front design.

Someone using a classical BDUF management approach thinks maximizing resource allocation and managing task dependencies is important: that’s why they spend all of their time looking at Gantt and PERT charts. When you try to sell either of them something, they want to know how it will fit the model they have in their head about how to succeed with software development, not why they should consider a new model.

And I’m not sure they’re wrong about what’s important to them, personally.

If one of those BDUF projects fails due to the architecture being a poor fit for what the team discovers is really important about the projects, or from poor decisions being made in haste at the beginning of the project, well… managers will say that the problem was with the people making poor decisions. Such managers are not shopping for tools to help them understand the trade-offs, because they do not believe they are making trade-offs.




Critical Chain is an amazing book. The narrative form, a novella detailing a technical project team and their search for a way to manage an uncertain process, is a big win: it highlights the important ways that Critical Chain Project Management handles risk and uncertainty and makes them visible where everyone can manage them.

The section on estimating tasks alone is priceless. If you can’t afford a copy and your library doesn’t stock it, borrow mine. You must read this book if you participate in software development teams.

Something I learned from selling Macintoshes back in the day is this: only make one sale. Convincing someone they have a problem is one sale. Convincing them you have the solution is another. And convincing them that today is the day to act is a third. If you have to do all three at the same time, you are doomed.

This is why experienced companies distinguish sales from marketing. The first two steps are marketing, the third is selling. When you are a new company, you don’t have the resources to market and sell. You have to work with an established pain point (eliminating the first hurdle), then use PR and limited marketing funds to get the word out that you have solved the problem (the second hurdle). You only have time and energy for the third sale, separating customers from their money.

But trying to convince managers that they’re doing it wrong (tactfully), then convince them that they want your product rather than some big rigid waterfall thing or three by five cards and XP, and then convince them that they should write a cheque today…

Bad idea. I should know, I discovered that I was trying to do exactly that.

Now you understand why I have used so much of this post to rant about the state of project management in software development. When you understand what most companies are doing and why, you understand what will sell, and why. And when I understood that the things my project was measuring and reporting were of little interest to the people who were my market, well...

The not so big business plan

I think there are lots of things that will sell, that do sell into this market. When you understand that this is a social problem, when you understand that most information is bad and that the incentives are to value bad information, and for managers to value activities that produce bad information over activities that produce good information, you can make something people will buy.

Like books, lectures, methodologies, and video training. If people have a social problem, a social solution is a natural fit.

There were interesting possibilities like selling this to BigCo for “process improvement.” But whenever I thought about such ideas, I noticed that the software wasn’t in there. I could have written a book and lectured on these ideas. I could have done a “search and replace” of agile for certitude.

A good business plan is one that is really specifically about you and your software. If your plan could be executed by someone else, or with a different project, it simply isn’t the right plan. And all variations of turning this into ConsultingWare were excellent ways of monetizing Reg Braithwaite the brand, but not ways of spreading the adoption of certitude, the product.

And so… and so I looked the sunken cost fallacy straight in the eye and said, no more. As much as I hate to lose, I have folded my project and I am sharing with you the important lessons I learned.

Lessons learned

Well, the good one is this: Paul Graham is right. If you phrase your venture as a question, you will walk away with something very valuable. I now know a lot more about project management and the culture of software development than I did when I started the project, and I wasn’t exactly a Spring Chicken. And I set out to learn about Bayesian Networks and Supervised Classification, but I ended up learning about Unsupervised Clustering and Collaborative Filtering.

What a journey.

And there is the one I ignored for far too long: Always Be Selling. I can thank friends like Leila Boujnane, Sutha Kamal, and Erich Finkelstein for reminding me about this, incessantly. Asking people to “buy” your idea forces you to make something people want. That being said, I think there’s a right amount of “push.” You can’t invent by polling the market. Quickly now, jump in your time machine and go back to 1981 or so. How many people wanted a mouse for their computer? But yet… Always be selling!

(There’s a bunch of other stuff that is far more personal, and maybe some of it will appear one day in public, but I wanted to keep this post to essay length, so it’s almost entirely about products and markets.)

So, here I am. Older, wiser, and ready for life’s next adventure. It might be more consulting for BigCo. It might be joining a start up and helping someone else’s dream flower. I don’t know, but if you have an idea...

My brain is open.

—Pál Erdös


 

Tuesday, May 22, 2007
  The Not So Big Software Design
A little less than a decade ago, Sarah Susanka wrote a very provocative book, The Not So Big House. I found out about it one evening while watching PBS. I switched to Channel 17, and there was an interview with her in progress. My partner and I were enthralled. We had been struggling to purchase a new home from “tract” or “subdivision” builders, and we simply couldn’t find anything that spoke to us. In a few short moments, Sarah articulated exactly why we were so frustrated by the builders.

Sarah spoke about a culture of building homes to impress. Of cookie-cutter McMansions, where everything was big, but nothing was warm and inviting. I can give you a very practical example of this syndrome: drive through any subdivision these days. Measure the space between the houses. It’s pitifully small! The reason is that the builders are building the largest homes they possibly can on each lot.




The Not So Big House: A Blueprint for the Way We Really Live emphasizes personalization: building a home that fits the way its owners actually live, not just a cookie-cutter idea of what life might be like or what features will be most impressive when company comes to visit.

The design conflict described in the book—the tension between designing for actual use vs. designing for size and visual impact—is uncannily similar to the tension between Agile (build what you actually use and need) and Buzzwords (build everything you might need using the most impressive technology stack available).

That means that very little light can get into the sides of the houses, and you see this when you look at the floor plans: everything is organized around large picture windows in the front and rear of the house. And no wonder: there is nothing to see to either side except the brick or siding of your neighbour’s house just a few feet away.

Sarah’s solution to the problems of poorly designed homes is to take a given budget, and instead of buying the largest home for that price, purchase a smaller home but invest in features and details that customize the home for your needs.

Applying this “not so big” thinking to the problem of houses squeezed together in a subdivision, you can try to place a smaller home on the lot and invest the construction savings in windows on three sides instead of having nearly all the windows on just the front and the back.

Everything in Sarah’s philosophy is driven by the owners’ actual lifestyle, not some imagined lifestyle that never comes to pass. So… unless you are a competitive dancer, Sarah is not going to design an impressive ballroom for your home. On a more practical note, she spends quite a bit of time discussing the merits of doing away with the formal dining room.

Very few people want to have company over for dinner in their kitchen, so Sarah often designs an eating area separated from the kitchen by sliding doors. You have an eat-in for everyday dining and a formal spot when you need to throw a dinner party.

This kind of thing is not free: sliding doors are expensive, and that’s why very few “tract” homes have them, even very expensive tract homes. But if you want a home that works, you make the choice to have fewer square feet but make those square feet work for you every day.

Lemons

The problem with tract houses can be summed up in a phrase: the builders are selling you lemons. I hope Bruce Schneier forgives me quoting wholesale from his excellent article about security problems:

In 1970, American economist George Akerlof wrote a paper called “The Market for Lemons,” which established asymmetrical information theory. He eventually won a Nobel Prize for his work, which looks at markets where the seller knows a lot more about the product than the buyer.

Akerlof illustrated his ideas with a used car market. A used car market includes both good cars and lousy ones (lemons). The seller knows which is which, but the buyer can’t tell the difference — at least until he’s made his purchase. I’ll spare you the math, but what ends up happening is that the buyer bases his purchase price on the value of a used car of average quality.

This means that the best cars don’t get sold; their prices are too high. Which means that the owners of these best cars don’t put their cars on the market. And then this starts spiraling. The removal of the good cars from the market reduces the average price buyers are willing to pay, and then the very good cars no longer sell, and disappear from the market. And then the good cars, and so on until only the lemons are left.

In a market where the seller has more information about the product than the buyer, bad products can drive the good ones out of the market.

Now don’t think about the house builders as bad people trying to sell you a bad house.

In the case of new homes, the bottom line is that if most buyers of homes cannot tell the difference between a home that will suit their lifestyle and one that will not, the builder has very little choice but to offer homes that offer the superficial features (like gross square footage) that will sway people into buying.

The entire problem is centered around the fact that the average home buyer is unable to tell the difference between a good house and a bad house, so they settle for superficial distinctions.

Building Better, Not Buzzwordier

Does this sound familiar? There are two obvious blog posts here, one about the fact that the average employer cannot tell the difference between good programmers and bad. The other about the average buyer of custom software. Since someone has already noted the similarity between used cars and programmers, let’s look at the similarity between houses and custom software projects.

I recently had a chance to review an architecture design for a custom software project. The designer was given a telephone book sized specification written by the client and asked to put together a high-level architecture plan.

Now right away, I want to say this is a tough spot to be in: it’s all well and good to talk about customizing things for clients, but you really need to talk to them if you want a shot at doing a good job. Whether you are a proponent of Agile or of BDUF, I think you will agree that no amount of documentation can replace communication, ever.

Anyway, I saw right away that the document was… What is the phrase?… Oh yes, as Richard Feynman would say, it was no damn good. It was a lemon.

It is very poor form to criticize this person’s work after establishing that they had very little chance of doing well. So this is my disclaimer: I am writing to talk about why these circumstances conspire to produce a lemon of an architecture plan. Got it? Good person, bad circumstances.

So what was wrong with the design? Quite simply, there was no client in it.

There was a technology stack, there were buzzwords, there was a very popular programming language, there even were some quasi-open source components. Lots to like, and difficult to criticize. Think about how such a conversation might go between two lemon sellers: “Why are you specifying Java, C# is the best language!” Or perhaps, “BizTalk?!?! No way, you want open standards, not lock-in!”

But there was no client in it. Tract houses are designed for the features that all inexperienced clients want to buy, making owners and tract houses interchangeable.1

And this design articulated the features that inexperienced clients (and inexperienced software designers) like to think about. These kinds of designs and clients are equally interchangeable.

Where is the client?

What I saw was a design with such broad strokes (“Database,” “ORM,” “Workflow,” “Templates”) that it could have been presented to hundreds of different clients without change. Now obviously, hundreds of clients need databases and what-not. So the design wasn’t wrong in the sense that none of the decisions it articulated were bad decisions.

But let’s stop for a moment and compare that architecture design to a home design. Imagine you hire an architect. They put together a preliminary design, a kind of sketch, for your consideration. They call you into their office for a presentation. The lights are lowered and the presentation begins.

The Consulting Engineer speaks. “Concrete foundation!” She says. Next slide. “Wood frame.” Next slide. “Brick exterior.” You are getting the same treatment as the clients looking at a design that goes into detail about the technology stack (“Java,” “Oracle,” “Hibernate,” “BizTalk”).

Or maybe the Consulting Engineer sits down and the Junior Architect takes over. “Four bedrooms. Maybe five.” Next slide. “En suite bathroom.” Next slide. These things are all decisions that must be made, but they have little or no connection to the client’s needs.

Isn't it obvious that a well-designed home with vinyl siding is a better choice than a poorly designed home clad in brick? But in the absence of better information, clients are forced to pick the brick over siding instead of choosing whether to have a formal dining room or whether to separate the eat-in from the kitchen with sliding doors.

And obviously two parents and three children want at least four bedrooms. But there is no talk of whether the master bedroom is on the main floor, or whether the architect has chosen to place the play room adjacent to the children’s bedrooms upstairs where the children can play without disturbing the adults or whether to place it downstairs where it can be seen from the kitchen.

It’s easy to see that the exterior of the house and the number of bedrooms are superficialities: To get at the important details, you have to ask a simple question: How is this different than what every other client is getting?

The really important architectural decisions are the ones that address how each client is unique, not what all clients have in common.

Better Software Architecture

Designing software is not easy. And truthfully, our environment makes it difficult, because our clients are not knowledgeable enough to distinguish the not-so-big applications (“Domain-specific languages,” “Agile development”) from the McMansion applications (“Industry-standard platform!” “Detailed Specifications!!”)

In the context of software developed for clients, good software architecture is, at its heart, architecture that is specific, not general. It isn’t all high-level abstractions that could apply to any client; it’s specific choices that address specific problems for that specific client.

It is easy to say that the cure for the general architecture is to add detail. If the lemon design requires five slides, flesh the design out into fifteen slides. If that isn’t specific enough, triple the length again and go to forty-five slides.

This would be the equivalent of taking the builder’s floor plan of a McMansion and filling in the exact dimensions. Or perhaps selecting the kitchen finishes and whether the shower fixture will be pressure-balanced or not.

Adding detail makes a design more specific, but it only makes it specific for a client if the choices expressed address the most important needs of the client. Naturally every new home buyer has a preference with respect to kitchen cabinetry. But does expressing that decision really reflect a deep understanding of the client’s lifestyle?

When you look at a high-level design for a client, it should be obvious at a glance that the design addresses specific needs. Someone who doesn’t know the client may need an explanation—if you looked at a home design with the master bedroom on the ground floor, would you know instantly that the clients have teen-aged children?—but if you know a little something about the client, you ought to be able to literally see the client in the design.

This should be true at each level of detail. It should never be necessary to drill down into the details to understand how the design solves the client’s specific problems. If you are looking at a five-slide high-level design, it should convey the one or two most important ways the software will solve the client’s most important and pervasive needs.

When you drill down to detail requiring forty-five slides, you should see solutions to problems that are a ninth as important as the solutions evident in the five-slide presentation.

Like Sarah’s approach, this type of design has a cost. When you only have five slides, using one slide to address a client’s specific problem means forgoing a slide full of buzzwords that impress the less-knowledgeable client.

I wish I could tell you that this will outshine the McMansion presentation from the big consulting firm full of buzzwords and no attention to the client. But it will not: most clients will buy the idea that their needs are not-so-unique, and if what they need doesn’t fit the architecture, they must change to adopt “best practices.”

But for the serious practitioner, good design is more important than technology stacks and buzzwords. More important than size and impressiveness. It may be “not so big.”

But it is better.



  1. In all my looking at tract houses, I saw just two departures from the norm: one builder offered a two storey model with a master bedroom on the ground floor so that when the children moved out and the owners aged they wouldn’t need to go upstairs (bungalows solve that problem as well, but you need a much larger lot). Another style, “New Urbanism,” put garages back behind the house where they belong. But 99% of them were just variations on the same theme.


 

Wednesday, May 09, 2007
  Hard-Core Concurrency Considerations
Kevin Greer responded to What Haskell teaches us about writing Enterprise-scale software with some insightful emails about the pros and cons of using purely functional data structures for managing concurrency in a multi-processor environment. I’ve reproduced them (with his permission) below.

Now, your first reaction might be, “Ah, Reg is not nearly as smart as he thinks he is.”

If you feel like giving me some credit, you can keep in mind that I was not writing the definitive essay on designing concurrent software, just pointing out some interesting overlaps between what I consider to be the most academic programming language around and the most practical requirements in business applications.

But there’s something more important to take from this.

The original post was a response to someone asking whether there was any value to the stuff you read on programming.reddit.com. His query was whether reading about Haskell was practical. My response was, yes it is, and here are some examples of where functional data structures have an analogue in concurrent data structures. My thesis (that’s a two dollar word for “point”) was that many “impractical” things have a lot to teach us about things we will encounter in the pragmatic “real world.”




Java Concurrency in Practice is written by the people who actually designed and implemented the concurrency features in Java 5 and 6. If you are writing Java programs for two, four, eight, or more cores and CPUs (and isn’t that practically all server-side Java development?), owning and reading this book should be the very next step in your professional development.



Of course, most people read the post and thought, “Cool, some neat stuff about concurrency.” Nothing wrong with that. If you value tips on concurrent programming (and I certainly do), you’ll find Kevin’s emails very insightful.

And if you are still wondering whether you should look at “impractical” and “academic” material like Haskell, Erlang, or other things not being promoted by MegloSoft, consider that one of the papers Kevin cites describes a data structure running on a 768 CPU system. Is this in a University lab somewhere? No, it is in a company that promotes itself as an Enterprise-Scale Java company.

You simply can’t assume that everything the industry tells you to learn is everything you need. Or that any one source (Cool! Raganwald has a new post on Lazy Evaluation) has the definitive answer.

You need to commit to life-long learning to be a software developer. Some of that learning is straightforward MSDN Magazine stuff, simple updates to things you already know. And some of that learning is a little further out on the curve.

Without further ado, here is one of the most comprehensive technical replies to a post on my blog I’ve received to date.

Concurrency Considerations

Hi Reg,

I was just reading your article on concurrent collections and have a few comments:

  1. Just because readers do not update the structure doesn’t mean that they don’t need to synchronize. This belief is a common source of concurrency bugs on multi-processor systems.

    Unless you synchronize on the root of your data-structure (or declare it as volatile), you can’t be sure that your cache doesn’t have a stale version of the data (which may have been updated by another processor). You don’t need to synchronize for the entire life of the method, as you would by declaring the method synchronized, but you still need to synchronize on the memory access. You hold the lock for a shorter period of time, thus allowing for better concurrency, but you still have to sync.

    If you fail to synchronize your memory (or declare it as volatile), then your code will work correctly on a single CPU machine but will fail when you add more CPU’s (it will work on multi-core machines provided that the cores share the same cache). If you have a recursive datastructure (like a tree) then you will need to actually synchronize on each level/node, unless of course you use a purely functional data-structure, in which case, you’ll only need to synchronize on the root.

    Given that you need to make a quick sync anyway, this approach is not much better than just using a ReadWrite-lock (it is slightly better because you can start updating before the readers finish reading (not a big deal for get()’s but potentially a big deal for iterators()), but then again updates are more expensive because of the copying).

  2. I don’t think that you should be using Haskell as a model of “Enterprise-scale” anything. “Enterprise-scale software” usually entails “Enterprise-scale hardware” which means, among other things, multiple-CPU’s. The problem is that Haskell’s purely-functional model doesn’t support multiple-CPU’s (it’s true, check the literature (except for specialized pipelined architectures, but not in the general case)).

    The whole processing “state” is essentially one large purely-functional data-structure. The problem of synchronizing your data-structure appears to be eliminated, but it has really just been moved up to the top-level “state” structure (monad), where the problem is actually compounded. This is worse, because not only would you need to lock your structure in order to make an update, but you would in fact need to lock “the whole world”.

    Some people will advocate launching one process per CPU and then using message passing to communicate between them. This is just a very inefficient (many orders of magnitude slower) way of pushing the synchronization issue off onto the OS/Network-stack. (Q: What do multi-core systems mean for the future viability of Haskell?)

  3. You forgot to mention the common technique of using “striping” to improve concurrency. Basically, what you do is create multiple sub-datastructures which each contain only a sub-set of the data. You can partition the data based on the hash of the key being stored. You then wrap the whole thing up so that it still has the same interface as the single data-structure.

    The advantage of this approach is that when you use a ReadWrite lock you only need to lock a small portion of the data-structure, rather than the whole thing. This allows multiple updates to be performed in parallel. This is how Java’s concurrent collections work. See: ConcurrentHashMap: Java creates 16 sub-structures by default but you can increase the number when even more concurrency is required.

  4. Have a look at Azul’s non-blocking HashMap implementation. They can do 1.2 billion hashMap operations per second (with 1%, or 12 million/sec, of those being updates) on a 768 cpu machine (the standard Java ConcurrentHashMap still does half a billion ops/sec, which isn’t bad either). This shows how scalable non-functional methods can be.

    I’ve never read of any Haskell or other purely-functional implementation being able to scale to anywhere near these numbers.
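Point 1 above (readers still need a brief sync on the root, even when the structure itself is purely functional) can be sketched in Java. This is a minimal illustration under assumptions, not code from Kevin’s email: the class names `StackNode` and `PersistentStack` and their methods are hypothetical. Nodes are never modified after construction, so a reader needs only one volatile read of the root, while writers serialize on a lock and publish a new root.

```java
// Hypothetical sketch: an immutable (purely functional) stack published through a volatile root.
final class StackNode {
    final int value;
    final StackNode next;

    StackNode(int value, StackNode next) {
        this.value = value;
        this.next = next;
    }
}

final class PersistentStack {
    // volatile: every read of the root sees the most recently published version
    private volatile StackNode root = null;

    // Writers serialize on this object's lock; existing nodes are never changed,
    // a new node is created and becomes the new root.
    synchronized void push(int value) {
        root = new StackNode(value, root);
    }

    // One volatile read hands the caller an immutable snapshot.
    StackNode snapshot() {
        return root;
    }

    // Walks a snapshot without holding any lock.
    static int sum(StackNode head) {
        int total = 0;
        for (StackNode n = head; n != null; n = n.next) {
            total += n.value;
        }
        return total;
    }

    int sum() {
        return sum(snapshot());
    }
}
```

A reader that took a snapshot before a later `push` keeps seeing its old version, which is the property that makes very long-running reads workable without starving writers.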




There’s another entire world of concurrency control, the world of independent actors communicating with fault-tolerant protocols. The world of Erlang. You can pre-order the most hotly anticipated book on the subject, Programming Erlang: Software for a Concurrent World, the definitive reference from the language's creator, Joe Armstrong.



Summary: Using a purely-functional data-structure does make it easier to support multiple readers, but you still need to sync briefly at the “top” of the structure. Striping has the added advantage of also supporting multiple-writers as well, and in practice, this is a much bigger win. Haskell is inherently limited to a single CPU, which doesn’t match modern hardware; especially of the “enterprise” class. Shared-memory models of concurrency have demonstrated much better performance than purely-functional models.
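The striping technique from point 3 can be sketched roughly as follows. This is an illustration under assumptions: `StripedMap` is a hypothetical class, and it is not a description of how `ConcurrentHashMap` is implemented internally. Each stripe is an ordinary `HashMap` guarded by its own `ReentrantReadWriteLock`, and the key’s hash selects the stripe, so writers to different stripes do not block each other.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch of lock striping: many small maps, one lock each.
final class StripedMap<K, V> {
    private final Map<K, V>[] stripes;
    private final ReadWriteLock[] locks;

    @SuppressWarnings("unchecked")
    StripedMap(int stripeCount) {
        stripes = (Map<K, V>[]) new Map[stripeCount];
        locks = new ReadWriteLock[stripeCount];
        for (int i = 0; i < stripeCount; i++) {
            stripes[i] = new HashMap<>();
            locks[i] = new ReentrantReadWriteLock();
        }
    }

    // Partition by the hash of the key; floorMod keeps the index non-negative.
    private int stripeFor(Object key) {
        return Math.floorMod(key.hashCode(), stripes.length);
    }

    V get(K key) {
        int i = stripeFor(key);
        locks[i].readLock().lock();
        try {
            return stripes[i].get(key);
        } finally {
            locks[i].readLock().unlock();
        }
    }

    // Only the one stripe holding this key is locked for writing.
    V put(K key, V value) {
        int i = stripeFor(key);
        locks[i].writeLock().lock();
        try {
            return stripes[i].put(key, value);
        } finally {
            locks[i].writeLock().unlock();
        }
    }
}
```

With sixteen stripes, for example, up to sixteen writers can proceed in parallel as long as their keys land in different stripes.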

Best Regards,

Kevin Greer

p.s. I've actually implemented a b-tree structure for very large data-sets and chose to use a purely-functional data-structure myself. My reason for doing so was that I have some very long-running read operations (~8 hours) and I obviously can't afford a ReadWrite lock that's going to starve writer threads for that long.

Another nice advantage of using purely-functional data-structures is that they make it easy to implement temporal-databases that let you perform time-dependent queries.

I just wanted to point out that if all you have is quick-reads then purely-functional is no different than a simple ReadWrite lock and that a super-pure implementation, as with Haskell, doesn't scale to multiple-CPU's or to highly concurrent updates. However, it can be used to good effect in combination with striping or other techniques.

(A tangential note: Java's GC is getting so good in recent releases that P-F data-structures are becoming much more efficient (given that P-F generates more garbage).)

p.p.s. One more advantage of functional data-structures (the largest advantage for me actually):

They map well to log(or journal)-based storage. Functional data-structures never update old data, but instead, just create new versions. If your data-structure uses disk-based storage then this means that you never need to go back and overwrite your file; you only need to append data to the end of the file. This has two advantages:

  1. This works well with the physical characteristics of hard-disks. They have incredibly high transfer rates (10’s of Megs/sec) but very slow seek times (~ 200 seeks/sec if 5ms access time). If you are adding say 1k objects to a data-structure which requires 2 updates on average to a file then you’re looking at about 100 updates per second. If on the other-hand you write all of these updates to the end of the file then you’re looking at say 20,000 updates per second!

  2. You can’t corrupt old data with interrupted writes. The old data is always still there. The only place that a corruption can occur is at the end of the file, in which case you just scan backwards until you find the previous good head of your data-structure.
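The arithmetic behind the first advantage can be checked with a few lines of Java. This is a back-of-the-envelope sketch: the class `DiskMath` is hypothetical, and the 20 MB/s transfer rate is an assumption chosen to stand in for “10’s of Megs/sec” (with 1 MB taken as 1,000,000 bytes and a 1k object as 1,000 bytes).

```java
// Hypothetical helper reproducing the estimate; all hardware figures are illustrative.
final class DiskMath {
    // Random writes: throughput is limited by seeks, and each object needs several of them.
    static long randomUpdatesPerSecond(double accessTimeMs, int seeksPerObject) {
        double seeksPerSecond = 1000.0 / accessTimeMs;
        return Math.round(seeksPerSecond / seeksPerObject);
    }

    // Sequential appends: no seeks, so throughput is limited by the transfer rate.
    static long appendUpdatesPerSecond(long bytesPerSecond, long bytesPerObject) {
        return bytesPerSecond / bytesPerObject;
    }
}
```

With a 5 ms access time and 2 seeks per object, `randomUpdatesPerSecond(5.0, 2)` gives 100. With the assumed 20 MB/s and 1,000-byte objects, `appendUpdatesPerSecond(20_000_000L, 1_000L)` gives 20,000, matching the figures in the email.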



This is fantastic stuff to share, thanks, Kevin!

What other tips can readers contribute for people building highly concurrent software (besides the frequent use of JProbe Threadalyzer, of course)? What online resources (papers, articles, books) do readers recommend for the professional developer?


 

Sunday, April 29, 2007
  Writing programs for people to read
Programs must be written for people to read, and only incidentally for machines to execute.
Abelson & Sussman, Structure and Interpretation of Computer Programs


This is about writing programs in a style that favours human comprehension over the convenience of the machine.1

Norbert Winklareth recently raised the question of minimizing the semantic distance between the program as written and the solution to the problem as conceived by the programmer.

Norbert was talking about comparing the capabilities of programming languages, but the idea of semantic distance is also useful for comparing programs to each other. Although this is not the entirety of writing good programs, let’s examine this idea in more detail.

Indeed, let’s look at one very simple, very powerful way of writing programs that are as semantically close as possible to the solution in the mind of the programmer.

Code that resembles its result

A template is a blueprint for describing the result you want, where instead of embedding data inside executable code, you turn things inside out and embed executable code inside data.

Templates are very popular in programs that generate markup:


<HTML>
<HEAD>
<TITLE>Hello World</TITLE>
</HEAD>

<BODY>
Hello, Example. Today's date and time is <%=Now()%>.
</BODY>
</HTML>


It’s obvious what result you want, much more obvious than if you tried the following:


page = new Page();
head = new Head();
title = new Title();
title.setText("Hello World");
head.add(title);
page.add(head);
body = new Body();
preamble = new StringBuffer();
preamble.append("Hello, Example. Today's date and time is ");
preamble.append(Time.now().toString());
preamble.append(".");
body.add(preamble.toString());
page.add(body);


This code produces its results as a side effect of its execution. The code itself doesn’t directly describe the result, whereas the first example directly describes the result we wish to generate.

Sometimes you need to generate the result as a side effect of the code. You needn’t write code as opaque as the example above. Instead, you can organize your code so that its form resembles the form of the result you are generating, such as this Scriptaculous code:

element = Builder.node('div',{id:'ghosttrain'},[
  Builder.node('div',{className:'controls',style:'font-size:11px'},[
    Builder.node('h1','Ghost Train'),
    "testtext", 2, 3, 4,
    Builder.node('ul',[
      Builder.node('li',{className:'active', onclick:'test()'},'Record')
    ]),
  ]),
]);


That produces this HTML:

<div id="ghosttrain">
  <div class="controls" style="font-size:11px">
    <h1>Ghost Train</h1>
    testtext234
    <ul>
      <li class="active" onclick="test()">Record</li>
    </ul>
  </div>
</div>


Just like the template example, you don’t need to run a simulation in your head to try to figure out what the code produces.

Is this just cancer of the semicolon?

What’s the difference between these two code samples?


preamble = new StringBuffer();
preamble.append("Hello, Example. Today's date and time is ");
preamble.append(Time.now().toString());
preamble.append(".");


…and…


"Hello, Example. Today's date and time is #{Time.now}."


Is the second just syntactic sugar for the first? No. It’s more than just syntactic sugar. People have a habit of saying “syntactic sugar” in a dismissive way. It’s another form of the argument that since the underlying language is Turing Equivalent, there is no need for any particular language feature.

Not all language features are just syntactic sugar. True syntactic sugar features are local features: you can replace the feature with some other equivalent code without having to change a bunch of stuff elsewhere.

Lazy evaluation and garbage collected memory management are not syntactic sugar: they require wholesale changes to the underlying model of computation to work. The abbreviated for loop in Java 1.5 is syntactic sugar: you can translate each loop into the equivalent old-style iterator loop without any additional support. For that matter, Java enums are also syntactic sugar: they’re a way to write the Type Safe Enum idiom with less boilerplate.

Okay, non-local features are not syntactic sugar. When is a local feature “just” syntactic sugar and when is it something more than that?

Let’s compare these two language features. Consider this Smalltalk code:


window
position: 80@80;
extent: 320@90;
backcolor: Color blue;
caption: 'My Blue Test Window'.


This shows a series of “cascading messages” to the same receiver. It saves you having to type the word “window” again, and it is a lot easier to read, because it lets you group messages that obviously belong together.

And earlier, we saw:


"Hello, Example. Today's date and time is #{Time.now}."


This is String Interpolation.2 It’s an “abbreviation” for a longer sequence of appends onto a StringBuffer.

The difference between these two trivial cases is that the first example doesn’t change your mental model of what’s going on when you read the code: you’re simply sending a bunch of messages to the window.




The Ruby Way is the perfect second Ruby book for serious programmers. The Ruby Way contains more than four hundred examples explaining how to do everything from distribute Ruby with Rinda to dynamic programming techniques just like these.


The second example is different in a very important way: honestly, when you read the second example, do you think “Aha! Start with a string, get Time.now(), append it to that string, then append a period”?

No way! You think “A String with Time.now() stuck in it.”

That’s a huge difference mentally: it’s not just shorter, it’s semantically closer to your mental model of the result you’re trying to achieve.
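To make the comparison concrete, here is a minimal Ruby sketch of both forms. It assumes a fixed timestamp (Time.at(0).utc, chosen only so the result is reproducible) and uses a plain String with << standing in for the StringBuffer:

```ruby
now = Time.at(0).utc  # fixed timestamp, an assumption made for reproducibility

# The "StringBuffer" form: the result is a side effect of a sequence of appends.
appended = String.new
appended << "Hello, Example. Today's date and time is "
appended << now.to_s
appended << "."

# The interpolated form: the code has the same shape as the result.
interpolated = "Hello, Example. Today's date and time is #{now}."

puts appended == interpolated
```

Both forms build the same string. What differs is how much simulation the reader has to do to see it.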

Wrapping up code that resembles its result

In summary, one way to write code that is comprehensible is to make sure that the form of the code matches the data the code generates. This is a very general principle: it can be found in web templates (like PHP and ASP pages), markup builder libraries, and even String or List Interpolation.

Features that support this style of writing code are more than simple syntactic sugar, because they alter the reader’s mental model, lowering the semantic distance between the code and the code’s result.

Bonus! Order now, and we’ll throw in these free Domain Specific Languages!

We saw that organizing code so that it resembles the result it generates lowers the semantic distance between the code and the solution. We saw two ways to do this: we can use templates or interpolation (if our language permits interpolation) to produce data, and where templates won’t work we can structure our code to resemble its result.

Well, we needn’t stop there. Domain-specific languages can provide this exact benefit.

The general purpose of a DSL is to write programs, or parts of programs, where the form of the code matches the mental model of domain experts or programmers. And here is one specific use for a DSL: to write code that closely matches the result it generates.

List Comprehensions model lists after mathematical notation. list { [x, y, x * y] }.given(:x => 1..12, :y => 1..12) directly describes a list of multiplicands and results.

(19|20)\d\d[- /.](0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01]) is a regular expression that matches dates.3 Do you think it is hard to read? What would happen if you “compile” that expression into procedural code with side effects? Which program would be easier to understand, debug, and modify?
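For anyone who wants to poke at that pattern, here is a small Ruby sketch. The \A and \z anchors and the date? helper are assumptions added for illustration; the body of the pattern is the expression quoted above, with the slash escaped because Ruby uses it as the regex delimiter:

```ruby
# The date pattern from the text. The \A and \z anchors are an added
# assumption so that the whole string must be a date.
DATE = /\A(19|20)\d\d[- \/.](0[1-9]|1[012])[- \/.](0[1-9]|[12][0-9]|3[01])\z/

# Hypothetical helper: true when the text matches the pattern.
def date?(text)
  DATE.match?(text)
end

puts date?("2007-04-04")  # a well-formed date
puts date?("2007-13-04")  # month 13 is rejected by the pattern itself
puts date?("1899-04-04")  # only years 1900 through 2099 are accepted
```

Note that the pattern describes the shape of a date, not the calendar: "2007/02/31" matches as well.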

There are many valuable uses for DSLs. One of them is to create programs that closely resemble the results they achieve, such as the HTML builder we saw earlier, list comprehensions, and regular expressions. DSLs can increase human comprehension by representing the desired result directly.

And if you call in the next fifteen minutes, you’ll get membership in the Pattern Matching family at no extra cost!

There’s another significant opportunity for writing code that increases human comprehension. We saw how to write code where the form of the code resembles its result. You can also write code where the form of the code resembles the data it consumes.

In ML, patterns allow us to make our functions resemble the different values they consume:

fun factorial 0 = 1
| factorial n = n * factorial (n - 1)


If you are not familiar with pattern matching, and especially with how languages like ML and Haskell combine patterns with their type checking system, maybe today is the day to spend a little time looking into this powerful idea for making comprehensible programs.
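For readers who work mostly in Ruby, here is a rough analogue of the ML function using Ruby's own case/in pattern matching. This is a sketch that assumes Ruby 3.0 or later (the feature did not exist when this post was written), and it has none of ML's type checking:

```ruby
# Each `in` clause resembles one shape of input, much like the ML clauses.
def factorial(n)
  case n
  in 0
    1
  in Integer if n > 0
    n * factorial(n - 1)
  end
end

puts factorial(5)
```

Anything that matches neither clause, such as a negative number, raises NoMatchingPatternError instead of recursing forever.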



  1. In my personal experience, “favouring human comprehension” does not mean favouring readability over writability—in order to write a program that solves a problem, I have to understand the solution, so comprehensibility applies to the act of composing and of reading programs by humans.

    Now about machines: In this day and age, “The convenience of the machine” is often a way of saying, “the convenience of the layer of abstraction just below your program.” In a sense, every layer of abstraction above the silicon is a kind of virtual machine.

    “Remember, it’s all software, it just depends on when you crystallize it.”—Alan Kay, as quoted by Andy Hertzfeld

  2. List Interpolation actually predates String Interpolation, but most people recognize String Interpolation. Lispers have a little thing called a quasiquote or backquote that builds lists or vectors in a template form.

  3. Early feedback suggested this is a poor example of a regular expression, because it looks obtuse. I could have selected something much simpler, however I wanted something that really would be incredibly obtuse if you tried to code it procedurally. (Not counting using built-in library functions for parsing dates, of course).

    The point is that this regex is readable, and you can pick out all of the special cases, right where they belong in the pattern. If readers can post some imperative code that does the same thing in a more readable form, that would be a very interesting lesson.


 

Wednesday, April 04, 2007
  Don't have a COW, man? What Haskell teaches us about writing Enterprise-scale software
Berlin Brown asked:
I read programming.reddit religiously but I rarely see something that could be used in a non-startup environment. Am I wrong, or are you considering deploying a haskell enterprise web application? And if the stuff discussed isn’t relevant for the next 5 years (i.e. an erlang based webapp) will it ever be relevant?

Yes, I use what I read on programming.reddit.com in my day job. That’s one of the reasons I have this day job: it’s part of what I do to sift through all of the cool stuff and find the things that are practical today.

Since you mentioned Haskell:

Consider a multi-threaded application with shared memory, like a really big web server that has some big shared collection of things in memory. From time to time you add things to the big collection, from time to time you remove them.

Pessimism and coarse-grained locking

One way to arbitrate multiple threads is to have one copy of the collection with strict locking protocols that apply to its coarse-grained operations like add, remove, and fetch.

In a language like Java, you can synchronize those methods and you’re done. You have just implemented mutex locking: only one thread can use the collection at a time. If one thread is fetching something from the collection, all other threads must wait, even if all they want to do is fetch things as well.

That sucks tbng qvpx, especially if you do lots of reading: why should thread 546 have to wait to fetch something just because thread 532 is currently fetching something?1
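In Ruby terms, coarse-grained locking looks roughly like the sketch below. LockedCollection is a hypothetical name, and a single Mutex stands in for Java's synchronized methods:

```ruby
# One lock guards every operation, so readers wait on other readers too.
class LockedCollection
  def initialize
    @items = {}
    @lock = Mutex.new
  end

  def add(key, value)
    @lock.synchronize { @items[key] = value }
  end

  def remove(key)
    @lock.synchronize { @items.delete(key) }
  end

  def fetch(key)
    @lock.synchronize { @items[key] }
  end
end

collection = LockedCollection.new
writers = (1..4).map { |n| Thread.new { collection.add(n, n * n) } }
writers.each(&:join)
puts collection.fetch(3)
```

It is correct and simple, and every fetch still queues behind every other fetch.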

Read and write locks

The next improvement is to have two kinds of locks: read locks and write locks. Two or more threads can lock the collection for reading, but if any thread locks the collection for writing, all of the other threads have to wait until it is done.

This works for about 17 clock ticks, and then you fix the bug by adding a new rule: if a thread wants a write lock but one or more threads have read locks, it has to wait, but any other threads that want read locks can’t have them. Even though the only threads with locks have read locks, they still have to wait.

The thread waiting to write gets a kind of pending write lock that blocks all other threads from taking out new locks. And then you fix the next bug by saying that all threads waiting wait in a priority queue so that the read threads aren’t starved by write threads and the write threads aren’t starved by read threads.



Purely Functional Data Structures takes you step by step through the design and implementation of copy on write collections. These collections can be used in purely functional languages, but they are just as useful in multi-paradigm languages like Java, Ruby, or Python handling multiple threads and performance optimization. The author’s thesis is available on line for free.

You now have a system that is pretty fast as long as you don’t write things very often. For example, you could build a fairly nice cache using read-write locking provided it is tuned so that you get lots of hits and only rarely have to drop things from the cache or add things to the cache. But if you’re doing something like maintaining a big index in memory where you have to make lots of updates, the writes will slow everything down.

These kinds of locking protocols are called pessimistic protocols: you assume bad things will happen and prevent them from happening up front by blocking threads from executing until it is safe to proceed.

Multi-version concurrency control

Another way to arbitrate multiple threads is to make copies of the collection whenever you perform an update.2 You maintain multiple versions of the collection. When a thread needs the collection, it grabs the latest copy. When it wants to remove or add elements, it writes a new copy without disturbing an existing copy.

This works really well in that threads that only want to read are never blocked. They always run at full speed, even if another thread is in the middle of an update. Hand-waving over how you figure out the whole “latest copy” thing, this scheme doesn’t work so well for threads that write.

The problem is one of serializability: if some set of operations takes place on the collection, the result must be the same as if the operations had been conducted one at a time on the collection. There is no guarantee about the order of the operations, just that the result is the same as if they had been carried out in some order.

Let’s use an example. Say our collection is a Map. It starts empty:

{ }


Operation A adds an element:

{...a: "A"...}


As does operation B:

{...b: "B"...}


And operation C:

{...c: "C"...}


If we start with an empty hash and perform all three operations, the result should be { a: "A", b: "B", c: "C" }, exactly the same result as if each operation were performed serially, one after the other. But what happens if each operation is performed by a thread that grabs the initial copy, {} and writes its result back to the collection? Something called a race condition: the result will be { a: "A" }, { b: "B" }, or { c: "C" }, with the “winner” being the last one to write its result.
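The lost update is easy to reproduce without any threads at all. The sketch below simulates it deterministically: three operations each read the same initial copy, then write their results back in an arbitrary order (the order chosen here is an assumption):

```ruby
initial = {}.freeze

# Each operation reads the same initial copy and builds its own result.
result_a = initial.merge(a: "A")
result_b = initial.merge(b: "B")
result_c = initial.merge(c: "C")

# Each then writes its whole result back; the last writer wins.
collection = initial
[result_b, result_a, result_c].each { |result| collection = result }
p collection

# What a serial execution would have produced instead.
serial = initial.merge(a: "A").merge(b: "B").merge(c: "C")
p serial
```

Only the element from the last writer survives, while the serial result contains all three.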

We can fix this problem in a couple of ways. One way is to keep the versions so that reading threads work at full speed, but use mutexes for write locks so that only one thread can write at a time. That’s simple, and if you can figure out a cheap way to make copies, works pretty well.

That’s your pessimistic protocol again (threads that write have to wait on other threads that write), but it’s a huge win for threads that read.
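A minimal Ruby sketch of that arrangement, with a hypothetical CopyOnWriteMap class. Readers take an immutable snapshot without locking, and writers serialize on a mutex. For brevity it copies the whole hash on every write, which is exactly the expense the clever tricks are meant to avoid:

```ruby
class CopyOnWriteMap
  def initialize
    @current = {}.freeze
    @write_lock = Mutex.new
  end

  # Readers take whatever version is current and never wait.
  def snapshot
    @current
  end

  # Writers queue up on the mutex; each one publishes a new frozen version.
  def add(key, value)
    @write_lock.synchronize do
      @current = @current.merge(key => value).freeze
    end
  end
end

map = CopyOnWriteMap.new
before = map.snapshot
threads = { a: "A", b: "B", c: "C" }.map do |key, value|
  Thread.new { map.add(key, value) }
end
threads.each(&:join)

p before.size        # the old snapshot is undisturbed
p map.snapshot.size  # the latest version has every element
```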

Making this work is tricky. Copying the entire thing is expensive, so you need to do clever tricks where you only copy the things that change and share the things that don’t change. And you can get a big, big win if you can avoid write conflicts by arbitrating conflict at a finer grain. For example, a HashMap uses a set of linked lists. If two different threads write to different “buckets,” you can merge their results rather than rolling one back and starting again.



One of the best books ever written on the subject of implementing fault tolerant concurrency (either on a single system or a distributed network) is Concurrency Control and Recovery in Database Systems.

Don’t be fooled by the word “database”—the techniques described are just as useful for implementing distributed algorithms like MapReduce, concurrent data structures like high-performance collections, or for building multi-threaded communication systems based on queues.

Like all classics, it’s also available online for free.



There is a lot of extra overhead for this if a thread wants to write while another thread is reading a version, so it is only a big win if writes are fairly rare. Remember, one of the big wins is that reads never wait on writes because they work with immutable versions of the collection.

Depending on how many threads you have, what kinds of operations are most likely, and other factors, this kind of system can be orders of magnitude faster than coarse-grained pessimistic locking.

Sometimes you want a slightly different protocol. The aforementioned is often called single write, many reads. It requires threads that plan on writing to know in advance they need to write. But for something like a cache, you don’t know you need to write until you miss the cache. And then it’s too late to get a write lock.

Optimism and many writes, many reads

The easiest way to avoid having to pre-declare whether a thread is a reader or a writer is by letting all of the threads do as they please. They all get the latest version and chug happily along.

When they are finished, if they never executed an add or remove we let go of the copy of the collection and we’re done. If a thread wants to write, it makes a copy as above and writes to its copy. But it doesn’t have to grab a lock while it is writing, so writes don’t wait on other writes.

Now, if a thread has executed a write (an add or remove), when it is done we check the result to see if it violates serializability.

For example, we can number our versions. Let’s say that {} is version 0. The first thread to finish, let’s say it’s the thread performing operation B, creates its result: { b: "B" }. Now it checks the collection to see if anyone has updated it since B read the collection. The collection is still at version 0, so B can write its result. B writes { b: "B" } to the collection and calls it version 1.

Next A finishes and notices that the collection is at version 1. This is a problem, because A’s result is an update of version 0, which is now out of date, so it has to throw out its work, fetch version 1, and try again. This is exactly the same thing as using a source code control system like Subversion to resolve conflicts.

This many reads, many writes strategy is called an optimistic protocol because you do work in the hope that nothing will cause you to throw it out and try again. It’s a big win if writes do happen at the same time, but they rarely conflict.
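Here is one way the version check might be sketched in Ruby. OptimisticMap is a hypothetical class: the work of building a new copy happens with no lock held, and a mutex is used only for the brief "is the version still what I read?" check at commit time (an assumption of this sketch, standing in for an atomic compare-and-swap):

```ruby
class OptimisticMap
  def initialize
    @version = 0
    @data = {}.freeze
    @commit_lock = Mutex.new  # held only for reads and the commit check
  end

  # Returns the current version number and its immutable data.
  def read
    @commit_lock.synchronize { [@version, @data] }
  end

  # Builds a new copy from the latest version; retries if someone else
  # committed in the meantime.
  def update
    loop do
      version, data = read
      candidate = yield(data).freeze  # the work is done without any lock
      committed = @commit_lock.synchronize do
        if @version == version
          @data = candidate
          @version += 1
          true
        else
          false                       # stale: throw the work away
        end
      end
      return candidate if committed
    end
  end
end

map = OptimisticMap.new
threads = { a: "A", b: "B", c: "C" }.map do |key, value|
  Thread.new { map.update { |data| data.merge(key => value) } }
end
threads.each(&:join)
p map.read
```

However the threads interleave, the final version is 3 and all three elements are present, because any thread that loses the race simply retries against the newer version.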

For example, if you have a good strategy for merging writes, this is huge.

So what?

Well, it would be great if you didn’t have to reinvent the wheel and have to work out all of the implications when you want to make a really fast shared collection in a multi-threaded environment. What you want is someone to share a wealth of experience about how to make really fast copies of things by only changing the little bits that you change instead of the whole thing, and so forth.

I like languages which say, “No, you don't want to write it the way you’re thinking. There’s a vastly better way to solve this whole class of problems.” Me: brain explodes

Eric M. Kidd


Where do you go for that kind of information? How about to people who spend all day thinking about collections that cannot change because their programming languages are purely functional?

Yes, what I’ve just described is exactly how languages like Haskell implement collections like dictionaries and lists. In purely functional languages, collections never change. Adding something to a collection is really creating a new collection with an extra element. This is the exact same implementation that we need for building optimistically locked collections in a multi-threaded environment!

Haskell teaches us extremely useful techniques for writing Enterprise-scale software.

And more techniques: Hard-core concurrency considerations



  1. Now, you might be saying, “what a waste, this is like locking a table in a database when we should be locking rows.” But if you look at the database closely, it does lock the table when you perform certain actions like deleting a row. Or it does something more complicated, and now that you’ve read the entire post, you know what it really does.

  2. Making Copies on Writes is called copy on write semantics, or COW for short. Chew on that for a while.


 

Friday, March 16, 2007
  An Approach to Composing Domain-Specific Languages in Ruby
Whoa! This looks like a long post with a lot of code snippets. Am I going to have to do a lot of hard thinking, or can I just relax and enjoy a good rambling essay?

This is a bit long, probably (like all my posts) 200% longer than necessary. If you just want to see a neat DSL that implements Haskell and Python’s List Comprehensions written in Ruby, just scroll to the bottom.

If I do bother to read it all, will I learn some neat hacks?

Yes, but you could learn them just as well by reading the source code directly.

So the benefit of reading the whole thing is...?

The List Comprehensions DSL is the what. The source code is the how. But the essay is the why.

Reading the whole thing will take you through some of the pitfalls of writing DSLs and explain why I chose my particular workarounds.

Furthermore, there are a lot of corners in Ruby where you can easily assume that things work one way, but really they don’t. If you actually try the snippets on your computer, you’ll have a much better chance of remembering where the pitfalls are. That’s why I tried to give a working example for every point, rather than just explaining things in words.

Of course, if you have no interest in writing your own Domain Specific Languages in Ruby just yet... this isn’t meant as a popular essay, rather it’s meant as an experience report for fellow practitioners. And honestly, there’s a world market for maybe five tools for writing DSLs in Ruby.

But since you’re here, the essay starts below!



An Approach to Composing Domain-Specific Languages in Ruby

Ruby is often touted as a good language for writing Domain-Specific Languages (“DSLs”). There are a few arguments in favour of writing a DSL as part of an application.

The first argument that comes to mind is that if the application’s domain experts have a specific natural language or jargon of their own, writing a DSL makes it easy for programmers and domain experts to collaborate. While it is rare to find substantial applications entirely written by non-programmers at this time in any language, it is quite feasible for non-programmers to write or validate portions of an application representing its “business rules” or domain logic, while programmers maintain its infrastructure.

include StarbucksDSL
order = latte venti, half_caf, non_fat, no_foam, no_whip
print order.prepare

Building Domain Specific Languages in Ruby


Another argument in favour of a DSL is that even when non-programmers are not involved directly in coding an application, the programmers themselves often have a jargon of their own to describe entities, algorithms and data structures in the application. Having portions of the application written in a language closely resembling the programmer’s own jargon makes it easy for them to read each other’s work and understand its intent.

Successful examples of DSLs embedded within existing languages and frameworks include Ruby on Rails’ ActiveRecord, where statements such as:

has_and_belongs_to_many :Bar
validates_presence_of :blitz
some_bars = Bar.find_by_tavern_license(license_number)

Are self-documenting to anyone familiar with relational models.

The final argument I’ll repeat here is that a DSL is a very effective way to separate the what from the how of an algorithm. Separation of concerns is a desirable property of good programs, and DSLs provide this separation very clearly. In the ActiveRecord examples above, the exact mechanisms of relating tables, validating records, and performing searches are “abstracted away” from the code where the programmer declares how she would like the results used.

Freedom is Slavery

DSLs can be hacked together quickly in Ruby (whether they can be made sufficiently robust for your production needs may require considerably more care). Hacking a DSL together with little effort is a benefit, especially when prototyping: sometimes the best way to design a DSL is to try to use it, so you can discover what you need to express.




Some developers have raised the concern that extensive use of “magic” features leads to code that cannot be understood or maintained.1 My own feeling is that DSLs lead to code that is easier to understand, not more difficult to understand. This leaves an argument about maintenance. Some techniques for meta-programming, such as extending core classes like Array, have what you might call “non-local effects.”

For example, two different pieces of code might try to extend the same core class, interfering with each other. Each works in isolation and passes all of its unit tests. But when plugged into a larger application that uses them together, they break.


Lispers are among the best grads of the Sweep-It-Under-Someone-Else’s-Carpet School of Simulated Simplicity.

—Larry Wall


Another problem occurs with extending the Kernel module or creating “top level” methods to be used as verbs in a DSL. You end up with name space crowding: you must be very careful that you do not redefine an existing method.

To fix this problem, the code that implements the DSL needs to be contained so that it does not interfere with other code. We can still implement verbs as methods, but we must implement those methods in separate objects, classes, or modules.

Zen in the Art of Program Maintenance

An established technique for implementing methods in objects is to define the methods and then execute a block of code using instance_eval so that it has access to the object’s methods.


I’m trying to get the Zen of building DSLs using Ruby. After reading a dozen or so pieces referenced by my favourite search engine, I have a feeling I’m still not quite getting it.

Don Box


You know, code expresses an idea better than words express an idea… when the idea is about coding. Please try this example in irb. Don’t just skim the text and nod: there’s a powerful learning mechanism at work when you physically do things as you’re learning, even if it’s just copying, pasting, comparing the result in one window to the text in another, and so on:

def bjarne
'Barney'
end

dsl = Object.new
def dsl.phred
'Fred'
end

plus = ' plus '

print dsl.instance_eval {
phred + plus + bjarne
}
##### "Fred plus Barney"

What does this show? Well, we have created a way to use a method defined in our dsl object, a local variable plus, and a top-level method bjarne. We can imagine scaling this up to defining a rich DSL in our DSL object and being able to mix verbs from the DSL with local variables and other methods as we please.

Touching back on the subject of containment, we have defined bjarne at the top level (strictly speaking, as a private method of Object, which every object inherits). Now bjarne is essentially global. If we already defined bjarne somewhere else, we just clobbered it. And if we later run a piece of code that defines bjarne, we’ll clobber our own version. phred is different. It’s defined inside of an object, and it doesn’t conflict with any other phred we define elsewhere.

Great! So… Can we cite a few examples of this technique in action (such as Jamis’ post where he calls phred and bjarne examples of Sandboxing and Top-level methods) and end the post here?

No. The code above looks fine. But there is a hidden problem with this sandboxing technique:

MyDsl = Object.new

def MyDsl.phred
'Fred'
end

class ClientCode

def bjarne
'Barney'
end

def friends
plus = ' plus '
MyDsl.instance_eval { phred + plus + bjarne }
end

end

ClientCode.new.friends
##### -:15:in `friends': undefined local variable or method `bjarne' for # (NameError) from -:15:in `friends' from -:20

WTF?! This looks just like our top-level example, but we’ve placed our code inside of a ClientCode method. And bjarne is a method in ClientCode: this way we can continue to separate concerns, keeping phred inside our DSL and bjarne inside of the class where we are using the DSL. But it doesn’t work.

Why instance_eval breaks (in tedious detail)

As you know, everything in Ruby is either a variable or a method (how it figures out the difference is a major irritation). When you invoke a method, you are actually sending a message to a receiver.2 Sometimes you name the receiver (some_object.a_method), and there is no ambiguity.

But when you just name the method (like bjarne), Ruby tries to find the method for itself. It does so by looking to see whether it is an instance method, in which case it behaves like self.bjarne. If not, it looks to see whether bjarne is top-level, in which case it calls that top-level method. See for yourself:

def foo
'top level foo'
end

def bar
'top level bar'
end

class Test
def bar
'instance method bar'
end
def test
p foo
p bar
end
end

Test.new.test
##### "top level foo" "instance method bar"

See? It looks for instance methods and then for top-level methods if it can’t find anything. (Again, we are hand-waving over the pesky problem with local variables in the case where we don’t use ()). What’s the problem? Well, I actually mis-described what happens. Here it is again, with more precision:

It looks for methods defined in the object self, and then for top-level methods if it can’t find anything. Of course, self is the current object. Unless it isn’t: That’s what instance_eval does: it evaluates a block but it changes self to point to its receiver instead of the object where the code is executing. Everything else stays the same. One more example to show the mechanism:

def foo
'top level foo'
end

def bar
'top level bar'
end

class Test
def bar
'instance method bar'
end
def blitz
'current object blitz'
end
def test
p foo
p bar
o = Object.new
def o.blitz
'redefined self blitz'
end
p o.instance_eval { blitz }
p o.instance_eval { 'bar within o gives: ' + bar }
end
end

Test.new.test
##### "top level foo" "instance method bar" "redefined self blitz" "bar within o gives: top level bar"

Now we see: when we use instance_eval, we route around our current object and all of our methods are ignored within the block. Ruby really only has two levels of scope: whatever belongs to self and whatever belongs to the top level.

This state of affairs is unsatisfactory: we would like to introduce a DSL in such a way that we retain access to all of our methods without kludges (like storing the current object in an instance variable).
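For reference, this is roughly what such a kludge looks like. The sketch captures the current object in a local variable named client (a hypothetical name) rather than an instance variable, because the block keeps its local variables under instance_eval while instance variables would be looked up on the new self:

```ruby
MyDsl = Object.new

def MyDsl.phred
  'Fred'
end

class ClientCode
  def bjarne
    'Barney'
  end

  def friends
    plus = ' plus '
    client = self  # the kludge: remember who we were before self changes
    MyDsl.instance_eval { phred + plus + client.bjarne }
  end
end

puts ClientCode.new.friends
```

It works, but every call back into the client now has to be spelled client.something, which is just the sort of noise a DSL is supposed to remove.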

Nesting Scopes



The Seasoned Schemer is devoted to the myriad uses of first class functions. This book is approachable and a delight to read, but the ideas are provocative and when you close the back cover you will be able to compose programs from functions in powerful new ways.

You can think of the current scope as being nested inside of the top-level scope. instance_eval doesn’t change the scope for things like local variables, it just points self elsewhere.

What we want is a new scope for our DSL nested inside of the current scope. So when we search for a method, we should check the DSL. If we don’t find it there, check the current object’s scope. If we don’t find it there, check the top-level.


Those who do not learn from the History of Lisp are doomed to repeat it.

Oops. John McCarthy called from 1960. He wants Lisp’s dynamic scoping back. Yes, our new feature is almost fifty years old. This is why either a thorough grounding in CS theory or a hobbyist’s interest in the history of programming is important for programming: much of what we want to do has already been done before, and sometimes in unexpected contexts. Who would have thought that a technique for helping programmers collaborate with Bond Traders has roots in Lisp 1.5?

Here’s an implementation of a nested scope construct that does exactly what we want. You declare a new class that extends DomainSpecificLanguage, and then you can use methods from your DSL, from your current object, and from the top-level (if you so choose). For example:

require 'dsl'

class MyDSL < DomainSpecificLanguage

def bjarne
'Barney'
end

end

class TheGreat

def phred
'Fredrick'
end

def test
plus = ' plus '
MyDSL.eval { p phred + plus + bjarne }
end

end

TheGreat.new.test
##### "Fredrick plus Barney"

This does exactly what we want with methods.
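If you are curious how a lookup like this could be arranged, here is one possible sketch. It is not the dsl library's actual implementation: NestedScope is a hypothetical base class that remembers the object the block came from (via Binding#receiver, which assumes Ruby 2.2 or later) and forwards any method the DSL lacks to that object with method_missing:

```ruby
class NestedScope
  # Evaluate the block with a fresh DSL instance as self, remembering
  # the object in which the block was written.
  def self.eval(&block)
    new(block.binding.receiver).instance_eval(&block)
  end

  def initialize(outer)
    @outer = outer
  end

  # Not found in the DSL? Try the enclosing object before giving up.
  def method_missing(name, *args, &block)
    if @outer.respond_to?(name, true)
      @outer.send(name, *args, &block)
    else
      super
    end
  end

  def respond_to_missing?(name, include_private = false)
    @outer.respond_to?(name, true) || super
  end
end

class MyDSL < NestedScope
  def bjarne
    'Barney'
  end
end

class TheGreat
  def phred
    'Fredrick'
  end

  def test
    plus = ' plus '
    MyDSL.eval { phred + plus + bjarne }
  end
end

puts TheGreat.new.test
```

Top-level methods need no special handling in this sketch, because they are already visible to every object, including the DSL instance.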

There's also a single extension to Kernel, the method with. with replaces the eval method, so you can also say:

with MyDSL do
p phred + plus + bjarne
end


The eval method creates a new instance of your DSL class, so you can track state within an evaluation. For example:

class Censor < DomainSpecificLanguage
attr_reader :ok_on_tv

def initialize(given_binding)
super(given_binding)
@ok_on_tv = true
end

def say something
something.split.each do |word|
@ok_on_tv = false if ['feces', 'urine', 'love', 'pudendum', 'fellator', 'oedipus', 'mammaries'].include?(word)
end
end

end

class GeorgeCarlin
def test
Censor.eval {
say "People much wiser than I have said, I'd rather have my son watch a film with two people making love than two people trying to kill one another."
say "And I of course agree. I wish I knew who said it first, and I agree with that."
ok_on_tv
}
end
end

p GeorgeCarlin.new.test
##### "false"

let

The first obvious drawback of this approach is that the blocks we pass to eval cannot take parameters. For this reason, rumour has it that a method called instance_exec will be added to Ruby in 1.9. (There are some implementations available that work in Ruby 1.8 if you would like to experiment.)

The second is that you don’t get anything like nested local variables, à la Pascal, Scheme, or any other language with block structure. Block structure is very powerful: You can use a variable within a particular scope and nowhere else. Here’s a trivial example:

with Let do
let :x => 0, :y => 1 do
assert_equal(1, x + y)
let :x => 2 do
assert_equal(3, x + y)
end
assert_equal(0, x)
end
end

We're using the with syntax. In the Let DSL, there’s a new method called let. let creates a new DSL within Let. You can see that re-declaring x does not clobber the value in the outer scope. That is because when let wrote a new DSL, it added x and y as methods.

So really, that block of code says “Write a new DSL where x and y are methods returning zero and one. Execute some code in that new DSL. That code will create another DSL where x is a method returning two.”
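To make the idea concrete, here is a rough sketch of scopes built out of methods. It is not the Let from dsl.rb: the Scope class is an assumption made for illustration, and unlike the real DSL it gives the block no access to the calling object’s context.

```ruby
# Hypothetical illustration of "bindings as methods"; the real Let differs.
class Scope
  def initialize(parent = nil)
    @parent = parent
  end

  # Create a child scope, define one reader method per binding on it,
  # and evaluate the block with the child scope as self.
  def let(bindings, &block)
    child = Scope.new(self)
    bindings.each do |name, value|
      child.define_singleton_method(name) { value }
    end
    child.instance_eval(&block)
  end

  # Names that are not defined in this scope are looked up in the enclosing one.
  def method_missing(name, *args, &block)
    return super unless @parent
    @parent.public_send(name, *args, &block)
  end

  def respond_to_missing?(name, include_private = false)
    (@parent && @parent.respond_to?(name, include_private)) || super
  end
end

result = Scope.new.let(:x => 0, :y => 1) do
  outer = x + y                    # x and y are methods of this scope
  inner = let(:x => 2) { x + y }   # x is shadowed, y comes from the outer scope
  [outer, inner, x]                # the outer x is untouched
end

p result
```

Each let creates a fresh object, so shadowing x in the inner block cannot clobber the outer x: the two definitions live on different objects.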

Because let defines methods and not local variables, bad things happen when you try to override real local variables. It’s best to use Let for some things and local variables for others, but not mix the two.

Like what, you ask?

List Comprehensions in Ruby

A List Comprehension is syntactic sugar that lets you build collections using set-like notation. For example, S = [ x | x<-[0..], x^2>3 ] is a list comprehension in Haskell.

Here is a List Comprehensions DSL in Ruby. Let’s say we’re building up a multiplication table. We want tuples of the form [x, y, x * y] given x is in the range 1..12 and y is in the range 1..12. Let’s write that:

require 'comprehension'

class MultiplicationTable
def twelve_by_twelve
with Comprehension::DSL do
list { [x, y, x * y] }.given(:x => 1..12, :y => 1..12)
end
end
end
p MultiplicationTable.new.twelve_by_twelve
##### [[1, 1, 1], [1, 2, 2], [2, 1, 2], [1, 3, 3], [2, 2, 4] ...

(In everyday use, you don’t need a class and a method for each comprehension: the important bit is list { [x, y, x * y] }.given(:x => 1..12, :y => 1..12). I just wrote it this way so you could see that comprehensions work fine inside of methods. You can also use more than one comprehension inside of a single with Comprehension::DSL do... end block: see the unit tests for examples.)

The expression in the block doesn’t have to be a tuple:

class MultiplicationTable
def twelve_by_twelve
with Comprehension::DSL do
list { "#{x} times #{y} is #{x * y}" }.given(:x => 1..12, :y => 1..12)
end
end
end
p MultiplicationTable.new.twelve_by_twelve
##### ["1 times 1 is 1", "1 times 2 is 2", "2 times 1 is 2", "1 times 3 is 3", "2 times 2 is 4", ...

And you can stick a “where” block on the end:

class MultiplicationTable
def twelve_by_twelve_odds
with Comprehension::DSL do
list { "#{x} times #{y} is #{x * y}" }.given(:x => 1..12, :y => 1..12) { (x % 2 == 1) && (y % 2 == 1) }
end
end
end
p MultiplicationTable.new.twelve_by_twelve_odds
##### ... 3 times 5 is 15", "5 times 3 is 15", "7 times 1 is 7", "1 times 9 is 9", ...
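For comparison, here is roughly what these comprehensions compute, written in plain Ruby without the DSL. This is a sketch, not how Comprehension::DSL is implemented, and the order of the results differs from the output shown above (Array#product walks one range at a time, where the DSL appears to interleave them).

```ruby
numbers = (1..12).to_a

# list { [x, y, x * y] }.given(:x => 1..12, :y => 1..12)
# Array#product builds every [x, y] pair; map plays the role of the list block.
table = numbers.product(numbers).map { |x, y| [x, y, x * y] }

# The trailing "where" block corresponds to a select before the map.
odds = numbers.product(numbers)
              .select { |x, y| x.odd? && y.odd? }
              .map { |x, y| "#{x} times #{y} is #{x * y}" }

p table.first(3)
p odds.size
```

The comprehension states the element, the ranges, and the condition in one expression; the plain version spells out the order in which product, select, and map have to run.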


Would you like to nest them? Your expression is the interpreter’s command:

class MultiplicationTable
def odds_times_evens
with Comprehension::DSL do
list { "#{x} times #{y} is #{x * y}" }.given(
:x => list { x }.given(:x => 1..12) { x % 2 == 0 } ,
:y => list { x }.given(:x => 1..12) { x % 2 == 1 } )
end
end
end
p MultiplicationTable.new.odds_times_evens
##### ... "2 times 11 is 22", "4 times 9 is 36", "6 times 7 is 42", ...

List Comprehensions and Let

What is the relationship to Let? Well, Let builds the scopes needed for evaluating the where clause and the block defining the elements of the list. Yes, we’ve built a DSL on top of a DSL on top of a DSL. Does this seem like weird trickery? I don’t know why. Do you have any idea how many levels of abstraction are responsible for you reading this essay right now?

This is what we humans do: we build tools on top of tools. Your browser runs on an OS, possibly in a VM, perhaps in a hypervisor, on top of a BIOS, and on and on. This is the normal state of affairs, not an exception.

Closing Remarks

It is possible to build DSLs in Ruby to facilitate cross-functional teamwork and separation of concerns. Care must be taken to avoid polluting the top-level name space, but it is possible to work within sandboxes and still have access to the current object’s context.

Oh yes, and programming is fun, as always.

Source Code

Update: The copy of dsl.rb has been updated to the latest version. I had committed a rather typical manual synchronization error: I copied the latest file to the wrong directory when I first posted this. Thanks, Justin!



How to try it for yourself: Open DomainSpecificLanguage and Let. Save the text only (not the HTML) as dsl.rb. Open Comprehension. Save the text only as anything you like, as long as it is in the same directory as dsl.rb: I use comprehension.rb. Run comprehension.rb.

  1. I generally call “Bullshit!” on any line of reasoning that sets up a straw man argument just to knock it down. So read on with skepticism!

  2. Alan Kay has said that he regrets popularizing the notion of “Object-Oriented” programming, and that he should have called it “Message-Oriented” programming.


 

Sunday, March 11, 2007
  Why Why Functional Programming Matters Matters
I recently re-read the amazing paper Why Functional Programming Matters (“WhyFP”). Although I thought that I understood WhyFP when I first read it a few years ago, when I had another look last weekend I suddenly understood that I had missed an important message.1

Now obviously (can you guess from the title?) the paper is about the importance of one particular style of programming, functional programming. And when I first read the paper, I took it at face value: I thought, “Here are some reasons why functional programming languages matter.”

On re-reading it, I see that the paper contains insights that apply to programming in general. I don’t know why this surprises me. The fact is, programming language design revolves around program design. A language’s design reflects the opinions of its creators about the proper design of programs.

In a very real sense, the design of a programming language is a strong expression of the opinions of the designer about good programs. When I first read WhyFP, I thought the author was expressing an opinion about the design of good programming languages. Whereas on the second reading, I realized he was expressing an opinion about the design of good programs.

Can we add through subtraction?

It is a logical impossibility to make a language more powerful by omitting features, no matter how bad they may be.

Is this obvious? So how do we explain that one reason Java is considered “better than C++” is because it omits manual memory management? And one reason many people consider Java “better than Ruby” is because you cannot open base classes like String in Java? So no, it is not obvious. Why not?

The key is the word better. It’s not the same as the phrase more powerful.2 The removal or deliberate omission of these features is an expression about the idea that programs which do not use these features are better than programs which do. Any feature (or removal of a feature) which makes the programs written in the language better makes the language better. Thus, it is possible to make a language “better” by removing features that are considered harmful,3 if by doing so it makes programs in the language better programs.

In the opinion of the designers of Java, programs that do not use malloc and free are safer than those that do. And the opinion of the designers of Java is that programs that do not modify base classes like String are safer than those that do. The Java language design emphasizes a certain kind of safety, and to a Java language designer, safer programs are better programs.

“More powerful” is a design goal just like “safer.” And yet, what does it mean? We understand what a safer language is. It’s a language where programs written in the language are safer. But what is a “more powerful” language? That programs written in the language are more powerful? What does that mean? Fewer symbols (the “golf” metric)?

WhyFP asserts that you cannot make a language more powerful through the removal of features. To paraphrase an argument from the paper, if removing harmful features was useful by itself, C and C++ programmers would simply have stopped using malloc and free twenty years ago. Improving on C/C++ was not just a matter of removing malloc and free, it was also a matter of adding automatic garbage collection.

This space, wherein the essay ought to argue that Java compensates for its closed base classes by providing a more powerful substitute feature, left intentionally blank.

At the same time, there is room for arguing that some languages are improved by the removal of harmful features. To understand why they may be improved but not more powerful, we need a more objective definition of what it means for a language to be “more powerful.” Specifically, what quality does a more powerful programming language permit or encourage in programs?

When we understand what makes a program “better” in the mind of a language designer, we can understand the choices behind the language.

Factoring

Factoring a program is the act of dividing it into units that are composed to produce the working software.4 Factoring happens as part of the design. (Re-factoring is the act of rearranging an existing program to be factored in a different way). If you want to compare this to factoring in number theory, a well designed program has lots of factors, like the number 3,628,800 (10!). A Big Ball of Mud is like the number 3,628,811, a prime.

Composition is the construction of programs from smaller programs. So factoring is to composition as division is to multiplication.

Factoring programs isn’t really like factoring simple divisors. The most important reason is that programs can be factored in orthogonal ways. When you break a program into subprograms (using methods, subroutines, functions, what-have-you), that’s one axis of factoring. When you break a program up into modules, that’s another, orthogonal axis of factoring.

Programs that are well-factored are more desirable than programs that are poorly factored.

In computer science, separation of concerns (SoC) is the process of breaking a program into distinct features that overlap in functionality as little as possible. A concern is any piece of interest or focus in a program.

SoC is a long standing idea that simply means a large problem is easier to manage if it can be broken down into pieces; particularly so if the solutions to the sub-problems can be combined to form a solution to the large problem.

The term separation of concerns was probably coined by Edsger W. Dijkstra in his paper On the role of scientific thought.

—Excerpts from the Wikipedia entry on Separation of Concerns

Programs that separate their concerns are well-factored. There’s a principle of software development, responsibility-driven design. Each component should have one clear responsibility, and it should have everything it needs to carry out its responsibility.

This is the separation of concerns again. Each component of a program having one clearly defined responsibility means each concern is addressed in one clearly defined place.

Let’s ask a question about Monopoly (and Enterprise software). Where do the rules live? In a noun-oriented design, the rules are smooshed and smeared across the design, because every single object is responsible for knowing everything about everything that it can ‘do’. All the verbs are glued to the nouns as methods.
My favourite interview question


In a game design where you have important information about a rule smeared all over the object hierarchy, you have very poor separation of concerns. It looks at first like there’s a clear factoring “Baltic Avenue has a method called isUpgradableToHotel,” but when you look more closely you realize that every object representing a property is burdened with knowing almost all of the rules of the game.



The Seasoned Schemer is devoted to the myriad uses of first class functions. This book is approachable and a delight to read, but the ideas are provocative and when you close the back cover you will be able to compose programs from functions in powerful new ways.

The concerns are not clearly separated: there’s no one place to look and understand the behaviour of the game.

Programs that separate their concerns are better programs than those that do not. And languages that facilitate this kind of program design are better than those that hamper it.

Power through features that separate concerns

One thing that makes a programming language “more powerful” in my opinion is the provision of more ways to factor programs. Or if you prefer, more axes of composition. The more different ways you can compose programs out of subprograms, the more powerful a language is.

Do you remember Structured Programming? The gist is, you remove goto and you replace it with well-defined control flow mechanisms: some form of subroutine call and return, some form of selection mechanism like Algol-descendant if, and some form of repetition like Common Lisp’s loop macro.

Dijkstra’s view on structured programming was that it promoted the separation of concerns. The factoring of programs into blocks with well-defined control flow made it easy to understand blocks and rearrange programs in different ways. Programs with indiscriminate jumps did not factor well (if at all): they were difficult to understand and often could not be rearranged at all.

Structured 68k ASM programming is straightforward in theory. You just need a lot of boilerplate, design patterns, and the discipline to stick to your convictions. But of course, lots of 68k ASM programming in practice is only partially structured. Statistically speaking, 68k ASM is not a structured programming language even though structured programming is possible in 68k ASM.

Structured Pascal programming is straightforward both in theory and in practice. Pascal facilitates separation of concerns through structured programming. So we say that Pascal “is more powerful than 68k ASM” to mean that in practice, programs written in Pascal are more structured than programs written in 68k ASM because Pascal provides facilities for separating concerns that are missing in 68k ASM.

For example: working with lists

Consider this snippet of iterative code:


int numberOfOldTimers = 0;
for (Employee emp: employeeList) {
for (Department dept: departmentsInCompany) {
if (emp.getDepartmentId() == dept.getId() && emp.getYearsOfService() > dept.getAge()) {
++numberOfOldTimers;
}
}
}


This is an improvement on older practices.5, 6 For one thing, the for loops hide the implementation details of iterating over employeeList and departmentsInCompany. Is this better because you have less to type? Yes. Is it better because you eliminate the fence-post errors associated with loop variables? Of course.

But most interestingly, you have the beginnings of a separation of concerns: how to iterate over a single list is separate from what you do in the iteration.


One problem with the for loop is that it can only handle one loop at a time. We have to nest loops to work with two lists at once. This is patently wrong: there’s nothing inherently nested about what we’re trying to do. We can demonstrate this easily: try calling a colleague on the telephone and explaining what we want as succinctly as possible. Do you say “We want a loop inside a loop and inside of that an if, and…”?

No, we say, “We want to count the number of employees that have been with the company longer than their departments have existed.” There’s no discussion of nesting.

In this case, a limitation of our tool has caused our concerns to intermingle again. The concern of “How to find the employees that have been with the company longer than their departments have existed” is intertwined with the concern of “count them.” Let’s try a different notation that separates the details of how to find from the detail of counting what we’ve found:


old_timers = (employees * departments).select do |emp, dept|
emp.department_id == dept.id && emp.years_of_service > dept.age
end
number_of_old_timers = old_timers.size


Now we have separated the concern of finding from counting. And we have hidden the nesting by using the * operator to create a Cartesian product of the two lists. Now let’s look at what we used to filter the combined list, select. The difference is more than just semantics, or counting characters, or the alleged pleasure of fooling around with closures.
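In stock Ruby, Array#* expects an Integer or a String, so the notation above presumably relies on a custom definition of * for lists. A runnable sketch of the same idea can use the built-in Array#product instead. The Employee and Department structs and the sample data are made up for illustration; only the field names follow the snippet above.

```ruby
# Hypothetical record types; the field names follow the snippet above.
Employee   = Struct.new(:name, :department_id, :years_of_service)
Department = Struct.new(:id, :age)

employees = [
  Employee.new("Ada", 1, 12),
  Employee.new("Bob", 1, 3),
  Employee.new("Cy", 2, 9)
]
departments = [Department.new(1, 10), Department.new(2, 4)]

# Array#product pairs every employee with every department (the "how").
# The block passed to select states which pairs we care about (the "what").
old_timers = employees.product(departments).select do |emp, dept|
  emp.department_id == dept.id && emp.years_of_service > dept.age
end

# Counting is a separate step from finding.
number_of_old_timers = old_timers.size
p number_of_old_timers
```

With this data, Ada and Cy have served longer than their departments have existed, so the count is two.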



I’m not a Haskell user (yet), but The Haskell School of Expression: Learning Functional Programming through Multimedia has received rave reviews and comes with solid recommendations. It’s on my wish list if you’re feeling generous!

* and select facilitate separating the concern of how to filter things (like iterating over them and applying a test) from the concern of what we want to filter. So languages that make this easy are more powerful than languages that do not, in the sense that they facilitate additional axes of factoring.

The Telephone Test

Let’s look back a few paragraphs. We have an example of the “Telephone Test:” when code very closely resembles how you would explain your solution over the telephone, we often say it is “very high level.” The usual case is that such code expresses a lot more what and a lot less how. The concern of what has been very clearly separated from the concern of how: you can’t even see the how if you don’t go looking for it.

In general, we think this is a good thing. But it isn’t free: somewhere else there is a mass of code that supports your brevity. When that extra mass of code is built into the programming language, or is baked into the standard libraries, it is nearly free and obviously a Very Good Thing. A language that doesn’t just separate the concern of how but does the work for you is very close to “something for nothing” in programming.

But sometimes you have to write the how as well as the what. It isn’t always handed to you. In that case, it is still valuable, because the resulting program still separates concerns. It still factors into separate components. The components can be changed.

I recently separated the concern of describing “how to generate sample curves for some data mining” from the concern of “managing memory when generating the curves.” I did so by writing my own lazy evaluation code (both the story and the code are online). Here’s the key “what” code that generates an infinite list of parameters for sample Bézier curves:


def magnitudes
LazyList.binary_search(0.0, 1.0)
end

def control_points
LazyList.cartesian_product(magnitudes, magnitudes) do |x, y|
Dictionary.new( :x => x, :y => y )
end
end

def order_one_flows args = {}
height, width = (args[:height] || 100.0), (args[:width] || 100.0)
LazyList.cartesian_product(
magnitudes, control_points, control_points, magnitudes
) do |initial_y, p1, p2, final_y|
FlowParams.new(
height, width, initial_y * height,
CubicBezierParams.new(
:x => width, :y => final_y * height,
:x1 => p1.x * width, :y1 => p1.y * height,
:x2 => p2.x * width, :y2 => p2.y * height
)
)
end
end


That’s it. Just as I might tell you on the phone: “Magnitudes” is a list of numbers between zero and one created by repeatedly dividing the intervals in half, like a binary search. “Control Points” is a list of the Cartesian product of magnitudes with itself, with one magnitude assigned to x and the other to y. And so forth.
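The “how” behind magnitudes is not shown here. As a guess at what it might look like, this sketch produces the same kind of sequence with Ruby’s built-in Enumerator rather than the LazyList class. It assumes the midpoints are generated level by level and that the endpoints zero and one are left out; the real LazyList.binary_search may differ on both counts.

```ruby
# A guess at the behaviour of LazyList.binary_search, using Enumerator.
# Assumptions: midpoints come out level by level, endpoints are omitted.
def magnitudes(low = 0.0, high = 1.0)
  Enumerator.new do |yielder|
    intervals = [[low, high]]
    loop do
      next_intervals = []
      intervals.each do |lo, hi|
        mid = (lo + hi) / 2.0
        yielder << mid                            # hand one value to the consumer
        next_intervals << [lo, mid] << [mid, hi]  # split the interval for the next level
      end
      intervals = next_intervals
    end
  end
end

p magnitudes.first(7)
```

The list is conceptually infinite, but values are only computed as the consumer asks for them, so taking the first seven does a bounded amount of work. This naive version remembers every interval of the current level, which is the sort of memory concern the lazy list code exists to manage.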

I will not say that the sum of this code and the code that actually implements infinite lists is shorter than imperative code that would intermingle loops and control structures, entangling what with how. I will say that it separates the concerns of what and how, and it separates them in a different way than select separated the concerns of what and how.

So why does “Why Functional Programming Matters” matter again?

The great insight is that better programs separate concerns. They are factored more purely, and the factors are naturally along the lines of responsibility (rather than in Jenga piles of abstract virtual base mixin module class proto_ extends private implements). Languages that facilitate better separation of concerns are more powerful in practice than those that don’t.

WhyFP illustrates this point beautifully with the same examples I just gave: first-class functions and lazy evaluation, both prominent features of modern functional languages like Haskell.

WhyFP’s value is that it expresses an opinion about what makes programs better. It backs this opinion up with reasons why modern functional programming languages are more powerful than imperative programming languages. But even if you don’t plan to try functional programming tomorrow, the lessons about better programs are valuable for your work in any language today.

That’s why Why Functional Programming Matters matters.


  1. And now I’m worried: what am I still missing?

  2. Please let’s not have a discussion about Turing Equivalence. Computer Science “Theory” tells us “there’s no such thing as more powerful.” Perhaps we share the belief that In theory, there’s no difference between theory and practice. But in practice, there is.

  3. I am not making the claim that I consider memory management or unsealed base classes harmful, but I argue that there exists at least one person who does.

  4. The word “factor” has been a little out of vogue in recent times. But thanks to an excellent post on reddit, it could make a comeback.

  5. So much so that we won’t even bother to show what loops looked like in the days of for (int i = 0; i < employeeList.size(); ++i).

  6. Another organization might merge employees and departments, or have each department “own” a collection of employees. This makes our example easier, but now the data doesn’t factor well. Everything we’ve learned from databases in the last forty years tells us that we often need to find new ways to compose our data. The relational model factors well. The network model factors poorly.


 

Friday, February 09, 2007
  Program in Java? You must be joking!

The Y combinator design pattern in Java is easily understood and can be used and maintained by unskilled, entry-level programmers.

Cuius rei demonstrationem mirabilem sane detexi. Hanc marginis exiguitas non caperet.

You know, this kind of joke seems to rile Java apologists to no end. They come out of the woodwork with their web browsers set to flame. Absolutely no criticism of the language that powers everything from the web to space exploration (although there is never any talk of toasters) is allowed.

And do not criticize the culture in any way whatsoever. Although it’s perfectly ok to boast that Java is designed to appeal to the widest possible diversity of skill levels, you may not suggest that Java programmers are stupid. Or else.


silly daddy!, originally uploaded by thomas.braithwaite.


I am trying not to tell anyone what to do, but I have an observation. Have you ever heard a politically incorrect but extremely funny joke about a member of a particular culture? You know you have. And furthermore, the joke was probably told by a member of the group victimized in the joke.

The unwritten rule is, if you’re a member, jokes are fair game. I was once with some, ahh, members of a religion that directly predated and evolved into Christianity. They were telling some jokes about their culture and religion. I had heard a few such jokes, but I wisely refrained from telling any. It’s not allowed. That’s the rule: outsiders may not joke.

So back to Java. Guess what? I’m an insider. I write Java code every working day. If you care about hollow appeals to authority, I once wrote a Scheme implementation in Java. I was also the team lead for JProbe Threadalyzer, a tool that analyses multi-threaded Java behaviour. And the development manager for JProbe Server-Side Suite (the aforementioned thread analyser, plus profiling, code coverage, and memory debugging tools). And various J2EE implementations with various degrees of Enterprisy-ness.

None of that makes me an expert. Nor does it make me right when I criticize or joke. But it does give me a certain smug right to joke. And don’t we all need a laugh from time to time, even if we’re laughing at ourselves? Perhaps especially if we're laughing at ourselves?

I think it’s a sign of good health to be able to laugh at ourselves and criticize our foibles. For all of the talk of Java as a “mature platform,” don’t you agree that “not taking criticism well” is a little, well, immature?

So since it’s Friday:


A Muslim, a Vegetarian, and a Java Programmer are traveling by foot, and they stop at a farm house to sleep for the night. The farmer is impressed at the obvious sophistication of the Java Programmer’s tales of Enterprise wonder, and he invites her into the house. The Muslim he sends to the hayloft, and the Vegetarian can sleep in the barn.

Well, the farmer is just pouring a night-cap and listening to the Java Programmer describe the time she knocked together a farm workflow application in less than a million lines of XML configuration code when there’s a knocking on the door.

He opens the door and the Vegetarian is standing there. “I’m sorry,” the Vegetarian apologizes, “But you slaughter animals in the barn, and eating meat is offensive to my beliefs. I cannot sleep in the barn.” The farmer thinks this is bunkum, but he was raised to be courteous to his guests, so he asks the Vegetarian to swap places with the Muslim.

The farmer knocks back his drink and turns down the lights. He can hear the Java Programmer setting up a sleeping bag factory to generate down-filled singleton sleeping containers in the living room. His wife is reading in bed, and he’s looking forward to catching up on the Wall Street Journal.

Well, he is just about to climb into bed when there’s a banging on the door. He opens the door, and the Muslim is standing there. “I’m sorry,” the Muslim apologizes, “But you keep pigs in the barn, and pigs are profane according to my beliefs. I cannot sleep in the barn.”

Muttering, the farmer rouses the Java Programmer off the couch and asks her to switch with the Muslim. He climbs into bed and has just started to read an interesting article on hedging commodity futures with convex derivatives when there’s a thunderous hammering at the door. His wife tells him to stay put and she goes to answer it. The farmer hears some excited talking, and a moment later his wife is at the bedroom door.

“Honey,” she says, “it’s the pigs.”


 

Tuesday, February 06, 2007
  But Y would I want to do a thing like this?

Choose life. Choose a job. Choose a starter home. Choose dental insurance, leisure wear and matching luggage. Choose your future. But Y would I want to do a thing like that?
Writing about first-class functions and their compatibility with object-oriented programming naturally leads to the Y combinator. And that is the point where eyes glaze over and soft, snoring sounds rise from RSS readers everywhere.

But please bear with me, this essay is not really about the Y combinator, it’s about learning new things and expanding our capacity to think.

Sharpening the saw

Years ago I picked up Stephen Covey’s book The 7 Habits of Highly Effective People. If the book is a test out of seven, I really wasn’t doing very well.

If you’ve read the book, you probably remember that he talked about “Sharpening the Saw,” investing in your own abilities. That’s incredibly important, but I don’t need to tell you that. If you exercise with programming katas, or learn a new programming language once a year, or pick up a book like The Reasoned Schemer and actually go through the exercises, then you are already in the top 1% of software developers for personal skills improvement. (Sorry, certifications don’t count. They are the classic case of doing the wrong thing for the wrong reason!)

New ideas—by which I mean, new to you—are an important way to sharpen your saw. If for no other reason than this: the brain needs regular exercise to perform at or near its potential. Learning new things keeps you sharp, even if you don’t directly use the things you learned.

Others have suggested that learning Lisp is beneficial to your programming skills in its own right. That’s one good way to sharpen your saw. But I add to that an important caveat: to obtain the deepest benefit from learning a new language, you must learn to think in the new language, not just learn to translate your favourite programming language syntax and idioms into it.

Think different

The interesting thing about that is that almost by definition, if you see something in, say, Lisp that solves a problem you already have, you won’t learn much from the Lisp code. It is tempting to think that Lisp (or any other language) will somehow do what you’re already doing in some wonderfully magic way that is obviously better. But no, that isn’t how it really works.

For your problems are tuned to your existing tools. You simply can’t imagine problems that your tools can’t solve well, much less can’t solve at all. That’s why there are so few continuation-based web servers. Who’s going to invent one unless they have a programming paradigm with continuations?


To Mock a Mockingbird: The most enjoyable text on the subject of combinatory logic ever written. What other textbook features starlings, kestrels, and other songbirds?
And worse, when a new tool is applied to a problem you think you know well, you will probably dismiss the things the new tool does well. Look at how many people dismiss brevity of code. Note that all of the people who ignore the statistics about the constant ratio between bugs and lines of code use verbose languages. Look at how many people dismiss continuation-based servers as a design approach. Note that all of them use programming languages bereft of control flow abstractions.

Thus, to truly learn a new tool, you must not just learn the new tool, you must apply it to new kinds of problems. It’s no good learning to replace simple iteration with first-class functions: all you’ve learned is syntax. To really learn first-class functions, you must seek out problems that aren’t easily solved with iteration.

The Why of Y

Which leads me back to fixed point combinators. They appear to have no practical (as in making money) use. And that’s why I’m suggesting to you that you figure out how to make one in your language of choice. The very fact that the problem is far outside of your realm of “practicality” guarantees that you will learn something. You won’t be simply applying your same-old, same-old techniques and patterns to a slightly new problem.

Start your research with Richard P. Gabriel’s The Why of Y. Try porting his examples directly to your favourite programming language. If what you want to use is too brain-damaged to support closures, you may need to do a little greenspunning and build a little functor infrastructure.

Don’t be dissuaded if you have to follow the functor route: you are learning far more about your language and about programming in general than the shmoes that settle for learning five new buzzwords related to the latest WS-* interoperability with XPath 3.x.

If you prefer a fun approach to learning, you can do no better than Raymond Smullyan’s To Mock a Mockingbird: an enjoyable romp through the world of combinatory logic. After reading this book, you will have mastered the S, K, I, Y, and other combinators. Added bonuses include a safe that can only be opened by applying Gödel’s Incompleteness Theorem to its combination. How can you read this book and not learn?

Eating my own dog food

I thought of a few things to say along these lines last week and then abruptly realized I was asking you to “Do as I say, not as I do.” What good is recycling problems I first encountered in University textbooks two decades ago? I put this post aside and set to work on a problem of my own.

I set out to write a function for making recursive functions—a function extensionally equal to the Y combinator—in Ruby. The ultimate goal is to take something like:

lambda { |f| lambda { |n| n.zero? && 1 or n * f.call(n-1) } }


And be able to have it call itself recursively. In this case, to compute the factorial function.

This is trivial, given that Ruby supports named recursion, but if you want to write a fixed-point combinator you want to write a function that makes recursive functions without using the host language’s support for named recursive calls. In other words, you are bootstrapping named recursion out of anonymous first-class functions.1

There are important theoretical implications of being able to do this, but the killer reason to try it is to learn.

I started my quest for a function for making recursive functions with a rather trivial observation based on OO programming and the Curry function:

require 'test/unit'

class ExempliGratia < Test::Unit::TestCase

  CURRY = lambda { |f, a| lambda { |*b| f.call(a, *b) } }

  def test_recursive_curry
    maker = lambda { |func_with_me|
      CURRY.call(func_with_me, func_with_me)
    }
    assert_equal(120, maker.call(lambda { |me, n| n.zero? && 1 or n * me.call(me, n-1) }).call(5))
  end

end


In some OO language implementations, this (or self) is a hidden parameter passed to each method. In the same spirit, the example code adds a parameter, me, purely to handle recursion. If you write a recursive function, like the venerable factorial, with the extra me parameter, a trivial currying operation evaluates it recursively without any need for names.

This is obviously deficient. As noted above, we want to write factorial like so:

lambda { |n| n.zero? && 1 or n * f.call(n-1) }


We’ll need an f from somewhere, and just as our Scheme colleagues do, we’ll bind one as a parameter in an enclosing lambda. So we want to write:

lambda { |f| lambda { |n| n.zero? && 1 or n * f.call(n-1) } }


And somehow this should be transformed into a working factorial function. For the test-driven crowd, we want to write:

def test_clean_up_loose_ends
  maker = ...

  factorial = maker.call(
    lambda { |f| lambda { |n| n.zero? && 1 or n * f.call(n-1) } }
  )
  assert_equal(120, factorial.call(5))

  iterative_factorial = maker.call(
    lambda { |f| lambda { |n, acc| n.zero? && acc or f.call(n - 1, n * acc) } }
  )
  tail_factorial = lambda { |n| iterative_factorial.call(n, 1) }
  assert_equal(120, tail_factorial.call(5))
end


Of course, we need some code for maker. And the iterative_factorial case shows that maker works for functions with more than one parameter. The solution I came up with is:

CURRY = lambda { |f, a| lambda { |*b| f.call(a, *b) } }
maker = lambda { |f|
  lambda { |func_with_me| CURRY.call(func_with_me, func_with_me) }.call(
    CURRY.call(lambda { |inner_func, me, *args|
      inner_func.call(CURRY.call(me, me)).call(*args) }, f)) }


The source code with each transformation from beginning to end is here (I strongly suspect that this “curry combinator” is actually the Y combinator with a huge amount of cruft hanging off it).
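
For comparison, here is a sketch of the textbook applicative-order fixed-point combinator (often called Z), written with the same kind of Ruby lambdas. It is not the derivation described in this post, and the name fix is a placeholder chosen for illustration. It accepts the same "function that takes f and returns the real function" shape that maker does.

```ruby
# Applicative-order fixed-point combinator (often called Z).
# `fix` is an illustrative name, not something from the post.
fix = lambda { |f|
  lambda { |x| x.call(x) }.call(
    lambda { |x|
      # Wrap the self-application in a lambda so it is only evaluated
      # when a recursive call is actually made; Ruby evaluates eagerly.
      f.call(lambda { |*args| x.call(x).call(*args) })
    }
  )
}

factorial = fix.call(
  lambda { |f| lambda { |n| n.zero? ? 1 : n * f.call(n - 1) } }
)

iterative_factorial = fix.call(
  lambda { |f| lambda { |n, acc| n.zero? ? acc : f.call(n - 1, n * acc) } }
)

puts factorial.call(5)               # 120
puts iterative_factorial.call(5, 1)  # 120
```

Neither lambda refers to itself by name; the only recursion comes from x being applied to itself.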

Unique or derivative, crap or craft, the process of getting it to work has enriched my mind by forcing me outside of my usual problem space. I still can’t think of a practical application for what I’ve just written. But I know I’ve stretched myself.

And now back to you: perhaps you’re rushing off to try to implement a fixed-point combinator from first principles. Perhaps your plan is to code the canonical examples in your usual language. Those are both good paths. But whether you follow them today or not, remember the underlying principle exemplified by the fixed-point combinator:

Do not dismiss impractical or weird problems. While you may not have an immediate application for the code you write to solve such problems, you are maximizing your learning when you venture outside of your usual problem space.



  1. Named recursion is stuff like foo = lambda { |...| foo.call(:bar) }. It takes advantage of the host language’s variable binding to recurse. If you want anonymous recursion, you should be able to assign the same lambda to another name and have it work just as well, as in: fufu = lambda { |...| foo.call(:bar) }. That won’t work if you are relying on Ruby’s name for foo.
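
A minimal sketch of the failure the footnote describes, using a factorial instead of the elided foo body (the lambda bodies here are my own, not the post's):

```ruby
# Named recursion: the lambda finds itself through the variable `foo`.
foo = lambda { |n| n.zero? ? 1 : n * foo.call(n - 1) }
puts foo.call(5)   # 120

# Give the same lambda a second name, then rebind the first one.
fufu = foo
foo  = lambda { |n| 0 }

# The body still looks up `foo`, so the recursive call lands on the impostor.
puts fufu.call(5)  # 0
```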

p.s. Don’t miss Tom Moertel’s derivation of the Y combinator in Ruby.


Wednesday, January 31, 2007
  Closures and Higher-Order Functions
There has been a great deal of interest in closures lately, driven in great part by the fact that there is talk of adding some form of anonymous functions to Java. Most of the time, people talk about “adding closures” to Java, and that prompts a flurry of questions of the form “what is a closure and why should I care?”

The discussion around closures tends to go on and on about the “closing over” of free variables and only lightly touch on the biggest change to Java: functions as first-class objects with a lightweight syntax for creating them. Making it easy to do something basic like define a new function is more than just a little syntactic sugar: it makes it easy to do new things with functions that were impractical when you needed a lot of boilerplate to make anything work.

Without understanding functional programming, you can’t invent MapReduce, the algorithm that makes Google so massively scalable.
—Joel Spolsky, The Perils of JavaSchools

I’m going to try to explain first class functions using Ruby (it is possible to write code that does exactly the same thing using the current Java feature set, however the result is so wordy that it obscures the basic idea being presented: call it accidental complexity, or perhaps yellow code.)

Ruby is a good language for demonstrating features that ought to be in Java. Like Java, Ruby uses squiggly brace syntax. Like Java, everything in Ruby is an object—whoops, Java has primitives. Okay, like Java, functions are represented as objects.

In Java you write:

interface IFromIAndI {
    Integer call(Integer a, Integer b);
}

IFromIAndI add_two_integers = new IFromIAndI() {
    public Integer call(final Integer a, final Integer b) {
        return a + b;
    }
};

(The Java convention is to name things in lowerCamelCase, but we’ll ignore that. If you need to print this essay on a dot-matrix printer you may want to make some changes first.)

In Ruby you write the function as:

add_two_integers = lambda { |a,b| a + b }

Later on, when you want to call your function in Java, you write:

add_two_integers.call(35, 42);

And if you like semicolons, you write the exact same thing in Ruby:

add_two_integers.call(35, 42);

You can do the same thing with multiplication:

multiply_two_integers = lambda { |a,b| a * b }

First Class Functions

In the examples above, functions look a little like methods. The Java version is obviously implemented as a method. But what we did in both cases was assign the resulting function to a variable. In Java, assigning a method to a variable is not particularly easy (it is possible using reflection).

Anything that can be assigned to a variable is a value. If it can also be passed as a parameter or returned from a method (or function), we say it is a first class value. Functions as first class values, or first class functions, are very interesting. For example, what can we do passing a function as a parameter to another function?

Hmmm. Well, I am breaking a cardinal rule of selling something. We’re talking about shiny new toys without identifying a problem to be solved. Let’s talk about my favourite problem: writing the same thing more than once, violating the DRY principle.

Here are two pieces of similar Ruby code:

adder_with_acc = lambda { |acc, list|
  if list.empty?
    acc
  else
    # list[1..-1] returns a copy of the list without the first element
    adder_with_acc.call(acc + list.first, list[1..-1])
  end
}
adder = lambda { |list|
  adder_with_acc.call(0, list)
}
adder.call([1, 2, 3, 4, 5])

And:

multiplier_with_acc = lambda { |acc, list|
  if list.empty?
    acc
  else
    # list[1..-1] returns a copy of the list without the first element
    multiplier_with_acc.call(acc * list.first, list[1..-1])
  end
}
multiplier = lambda { |list|
  multiplier_with_acc.call(1, list)
}
multiplier.call([1, 2, 3, 4, 5])

What do they both do? Pretty much the same thing: they accumulate the result of some binary operation over a list of values. adder accumulates addition, and multiplier accumulates multiplication. You could call this a “Design Pattern.” If you did that, you would use the exact chunk of code everywhere. I would call that retrograde. Didn’t our predecessors invent the subroutine so we could eliminate writing the exact same piece of code over and over again?

Why can’t we do the same thing? Well, we can. A subroutine does the same thing over and over again, but it takes different parameters as it goes. What is different between adder and multiplier? Ah yes, the adding and multiplying. Functions. What we want is a function that takes a function as a parameter.

Well, we said that with first-class functions, functions are values and can be passed as parameters. Let’s try it:

folder = lambda { |default_value, binary_function, list|
  fold_with_acc = lambda { |acc, list|
    if list.empty?
      acc
    else
      fold_with_acc.call(binary_function.call(acc, list.first), list[1..-1])
    end
  }
  fold_with_acc.call(default_value, list)
}

Now we can use our function that takes functions as a parameter:

folder.call(0, add_two_integers, [1, 2, 3, 4, 5])
folder.call(1, multiply_two_integers, [1, 2, 3, 4, 5])

This is much better. When functions can take functions as parameters, we can build abstractions like folder and save ourselves a lot of code. Note that this would be a lot harder to read if we had to surround all of our functions with Object boilerplate in Java. That’s one of the key reasons why ‘syntactic sugar’—making it brief—is a big win.

And you know what? Functions are values, not just variables that happen to hold functions. These work just as well:

folder.call(0, lambda { |a, b| a + b }, [1, 2, 3, 4, 5])
folder.call(1, lambda { |a, b| a * b }, [1, 2, 3, 4, 5])

There’s just one problem (actually two, but I’m saving one for later): everywhere you use our new folder function, you need to remember that add_two_integers needs a default value of zero, but multiply_two_integers needs a default value of one. That’s bad. Sooner or later you will get this wrong.

What we need is a way to call folder without having to always remember the correct initial value. Should we extend our understanding of a function to include a default initial value for folding? If we’re thinking in Java, maybe our IFromIAndI interface needs a getDefaultFoldValue? I think not. Why should a function know anything about how it’s used? And besides, as we build other abstractions out of functions we’ll need more stuff.

If we aren’t careful, we’ll end up implementing the Visitor pattern on functions, and all of our brevity will go out the window. No, what we want is this: in one place we define that folding addition starts with a default value of zero and in another place we say we want to fold, say, [1, 2, 3, 4, 5] with addition. Then when we want to fold something else with addition, like [2, 4, 6, 8, 10], we shouldn’t have to say anything about zero again.

Adding Curry

What we need is a function that folds addition. Didn’t we say that functions are values that can be returned from functions? How about a function that makes a folding function? We should pass it our initial value and our binary function, and it should return a function that performs the fold without needing an initial value as a parameter:

fold_coder = lambda { |default_value, binary_function|
  fold_with_acc = lambda { |acc, list|
    if list.empty?
      acc
    else
      fold_with_acc.call(binary_function.call(acc, list.first), list[1..-1])
    end
  }
  lambda { |list|
    fold_with_acc.call(default_value, list)
  }
}

Now we can do the following:

adder = fold_coder.call(0, lambda { |a, b| a + b })
adder.call([1, 2, 3, 4, 5])
adder.call([2, 4, 6, 8, 10])

No more remembering that addition starts with a default of zero.

Actually, there’s a far simpler way to avoid having to remember the default value when you want to fold over addition. But let’s just play along so that we don’t have to come up with an entirely new set of examples to demonstrate the value of functions as first-class values.
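
The post does not say which simpler way it has in mind. One candidate, offered here as an assumption, is Ruby’s built-in Enumerable#inject: called without an initial value, it seeds the accumulator with the first element of the list, so there is no default to remember.

```ruby
numbers = [1, 2, 3, 4, 5]

# With no initial value, inject uses the first element as the seed.
sum     = numbers.inject { |acc, n| acc + n }
product = numbers.inject { |acc, n| acc * n }

puts sum      # 15
puts product  # 120

# The catch: with nothing to seed from, an empty list yields nil.
empty_result = [].inject { |acc, n| acc + n }
p empty_result  # nil
```

The nil for an empty list is the trade-off: the explicit default in fold_coder gives a sensible answer there, and the seedless inject does not.
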
Functional programmers (as opposed to the rest of us dysfunctional programmers) will recognize this as currying our folder function. Currying is when a function takes more than one parameter and you combine one of the parameters and the function to produce a function that takes fewer parameters.

Here’s a currying function in Ruby:

curry = lambda { |fn, *a|
  lambda { |*b|
    fn.call(*(a + b))
  }
}

(This is an improvement on an earlier version, thanks to Justin's comment.)

So you can use our new function to create an increment function out of our adder and a treble function out of our multiplier:

plus_one = curry.call(add_two_integers, 1)
times_three = curry.call(multiply_two_integers, 3)


If you are ever asked, “what good is currying?,” I hope I’ve given you an example you can use to explain why currying matters, and why people do it all the time (possibly without explicitly naming it). Although it doesn’t look like much when looking at trivial examples like functions that multiply by three, it’s much more useful when creating folders and mappers where you want some of the parameters to remain constant.
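
To make the folder case concrete, here is a small self-contained sketch that repeats the definitions above and then curries a fold. The folder in it leans on Enumerable#inject rather than the hand-rolled recursion, purely to keep the example short; that substitution is mine, not the post's.

```ruby
curry = lambda { |fn, *a|
  lambda { |*b| fn.call(*(a + b)) }
}

add_two_integers      = lambda { |a, b| a + b }
multiply_two_integers = lambda { |a, b| a * b }

plus_one    = curry.call(add_two_integers, 1)
times_three = curry.call(multiply_two_integers, 3)

puts plus_one.call(41)             # 42
p([1, 2, 3].map(&times_three))     # [3, 6, 9]

# A stand-in for the post's folder, built on inject for brevity.
folder = lambda { |default_value, binary_function, list|
  list.inject(default_value) { |acc, x| binary_function.call(acc, x) }
}

# Fixing the first two parameters leaves a function of the list alone,
# which is what fold_coder built by hand.
adder = curry.call(folder, 0, add_two_integers)
puts adder.call([2, 4, 6, 8, 10])  # 30
```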

Composition

Our examples combined functions and non-functions to create new functions. Here’s an example from a recent post, Don’t Overthink FizzBuzz, where I give a method for composing functions. The idea is that if you have multiple functions that each take one argument, you can combine them using compose. I also have a method that generates functions, carbonation:

# fizzbuzz.rb

def compose(*lambdas)
  if lambdas.empty?
    lambda { nil }
  elsif lambdas.size == 1
    lambdas.first
  else
    lambda do |n|
      lambdas.first.call(compose(*lambdas[1..-1]).call(n))
    end
  end
end

def carbonation(modulus, printable_form)
  i = 0
  lambda do |n|
    (i = (i + 1) % modulus) == 0 && printable_form || n
  end
end

print((1..100).map(&compose(carbonation(15, 'FizzBuzz'), carbonation(5, 'Buzz'), carbonation(3, 'Fizz'))).join(' '))


The simple explanation of how it works is that carbonation generates functions that replace every so many elements of a list with a printable string. Compose composes any two or more methods together. So if you want to print out 100 numbers, but replace every third number with “Fizz,” every fifth with “Buzz,” and all those that are third and fifth with “FizzBuzz,” you generate a function for each replacement, compose them together with compose, and then map the numbers from one to one hundred to the resulting überfunction.
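
The order in which compose applies its arguments is easy to get backwards, so here is a smaller sketch using two throwaway functions (add_one and double are invented for this example). The rightmost function runs first, which is why the FizzBuzz replacement, the one that has to win, is listed first in the call above.

```ruby
def compose(*lambdas)
  if lambdas.empty?
    lambda { nil }
  elsif lambdas.size == 1
    lambdas.first
  else
    lambda do |n|
      lambdas.first.call(compose(*lambdas[1..-1]).call(n))
    end
  end
end

add_one = lambda { |n| n + 1 }
double  = lambda { |n| n * 2 }

puts compose(double, add_one).call(5)  # 12: add_one runs first, then double
puts compose(add_one, double).call(5)  # 11: double runs first, then add_one

# Assuming Ruby 2.6 or later, the built-in Proc#<< composes in the same order.
built_in = double << add_one
puts built_in.call(5)                  # 12
```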

When you look at this today, it seems weird and unreadable by Java standards. I wonder if adding first-class functions with simple syntax to Java will lead the Java community to a place where code like this will not appear out of place?

Just one more thing

So we started by saying that people are getting hung up on what makes a closure a closure, and there has been less emphasis on the benefits of using functions as first-class values. Did you notice that our folder function actually includes a non-trivial closure?

If you look at the fold_with_acc function, it makes use of binary_function, a variable from its enclosing lexical scope. This is not possible with the current version of Java: if you translate this to Java, when you make fold_with_acc an anonymous inner class, you will have to copy binary_function into a final member to use it. It simply won’t compile if you try an idiom-for-idiom translation, even adding explicit types.

And then if you look at the anonymous function it returns, lambda { |list| fold_with_acc.call(default_value, list) }, that anonymous function uses default_value, another variable from the enclosing lexical scope. Once again you will have to fool around with final variables to make this work, or perhaps declare a full-fledged object with constructors.

(If you try writing this simple example out in Java, you quickly find yourself inventing a lot of classes or interfaces. And they have some complicated types, like a function taking an integer and a function taking two integers, returning a function taking a list of integers and returning an integer.

After twenty minutes of that, you understand why the ML and Haskell communities use type inference: If the types are that complicated, it’s incredibly helpful to have the compiler check them for you. Yet if the types are that verbose, it’s incredibly painful to write them out by hand. Even if your IDE were to write them for you, they take up half the code, obscuring the meaning.

You also get why the Ruby on Rails community doesn’t care about type checking: types for CRUD applications are way less complicated than types for first-class functional programs.)
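
A sketch of what closing over a variable buys you, using a made-up make_accumulator rather than the fold: the returned lambda both reads and updates total, a local variable of the function that created it. That update is exactly what copying into a final variable rules out.

```ruby
# `make_accumulator` is a hypothetical helper, not part of the post's code.
make_accumulator = lambda { |start|
  total = start
  # The inner lambda captures the variable `total` itself, not a copy of it.
  lambda { |n| total += n }
}

acc = make_accumulator.call(10)
puts acc.call(5)  # 15
puts acc.call(5)  # 20

# Each call to make_accumulator gets its own independent `total`.
other = make_accumulator.call(0)
puts other.call(1)  # 1
```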

That’s Interesting

Part of the interest in closures is in simplifying the syntax around functions, and part of the interest is in the way that access to enclosing scope would simplify a lot of code. There’s a whole debate around the value of simplification in a world where all serious languages are Turing Equivalent.

I hope you’re convinced, by now, that programming languages with first-class functions let you find more opportunities for abstraction, which means your code is smaller, tighter, more reusable, and more scalable.
—Joel Spolsky, Can Your Programming Language Do This?

For me, simpler is just nicer until something reaches a certain tipping point: when it becomes so simple that the accidental complexity of using it goes away, I will start using it without thinking about it. Tail optimization is like that: as long as recursion is slower than iteration and sometimes breaks, I have to think about it too much. But when I’m not burdened with “except…” and “when performance is not a factor…” it becomes natural.

And then something interesting happens. It changes the way I look at problems, and one day I see a whole new way to do something that I never saw before. Functions as first class values are definitely one of those things that change everything.

Further Reading

If this has whetted your appetite for more, Structure and Interpretation of Computer Programs is the book on higher-order functions and how they can be used as building blocks to create more elaborate abstractions such as object-oriented programming.

The Seasoned Schemer devotes an entire book to the uses of functions. Although the examples are in Scheme, the language is dead simple to learn and the techniques in the book can be applied to Ruby and Java (or at least to a future version of Java where you do not need functors).

The second edition of Programming Ruby is an indispensable guide. Even if you will not be using Ruby immediately, pick it up and discover why so many people are lauding the language's simple, clean design and powerful Lisp-like underpinnings.

As the author says, “Higher-Order Perl: Transforming Programs with Programs is about functional programming techniques in Perl. It’s about how to write functions that can modify and manufacture other functions.

“Why would you want to do that? Because that way your code is more flexible and more reusable. Instead of writing ten similar functions, you write a general pattern or framework that can generate the functions you want; then you generate just the functions you need according to the pattern. The program doesn’t need to know in advance which functions are necessary; it can generate them as needed. Instead of writing the complete program yourself, you get the computer to write it for you.”

It’s worth reading even if you have no intention of using Perl: the ideas span languages, just as SICP is worth reading even if you don’t use Scheme at work. And be sure to read Higher-Order JavaScript and Higher-Order Ruby. They translate HOP’s ideas to other languages.

Notable Follow-ups:

Some useful closures, in Ruby: “The Dylan programming language included four very useful functions for working with closures: complement, conjoin, disjoin and compose. The names are a bit obscure, but they can each be written in a few lines of Ruby.”

From Functional to Object-Oriented Programming: “OO allows a traceable connection between the conceptual design level and the implementation level. Concepts have names, so you can talk about them, between programmers and architects.”

HOF or OOP? Yes!: “First-class functions are a natural fit with OO, as evidenced by their presence in OO languages that aren’t glorified PDP-11 assemblers with some OO stuff bolted on the side.”

But Y would I want to do a thing like this?: “To truly learn a new tool, you must not just learn the new tool, you must apply it to new kinds of problems. It’s no good learning to replace simple iteration with first-class functions: all you’ve learned is syntax. To really learn first-class functions, you must seek out problems that aren’t easily solved with iteration.”


Thursday, January 18, 2007
  Business programming standards have become higher in 2007. Learn to love it.
From time to time people suggest that fundamental computer science familiarity is irrelevant to the work of a business programmer. I am talking about a knowledge of recursion, operations on data structures, code generation, and other topics that are often derided as being “unnecessary” in a business programming context.

Hmmm.

I think this is wrong in 2007. It may not have been wrong in 2002: perhaps such knowledge was a bonus then, but not a basic requirement. But today, I think you need to have it. And I don’t mean you need to have it on your resumé. I mean you will use it on your job from time to time.

Now, I know that some readers are shaking their heads, no. And some are nodding their heads, yes. It’s easy to think this is all about culture, and some kind of weird hacker fraternity, or whatever. It’s especially easy to dismiss stuff you never use: if you never needed it before, why would you need it now?

That’s the Blub talking. Your toolbox is good enough because you've never needed anything else to do a job... up to now.

No matter what you think of Lisp, Google-style interviews (do you remember when they were “Microsoft-style interviews”?), or optimizing code, please put that aside and try to read this post as objectively as possible. I’ll lay out my thinking for you.

When you say “these things are not relevant for the job,” how do you know? Ok, you have twenty years of experience. And you’ve never used recursion or you’ve only used it once or twice. So you won’t need it now, you’ll find a way to use iteration. And who cares what a suffix array is, you’ll look it up in Wikipedia if you need to implement one.

That’s what people have done for quite a while: wire existing frameworks together. What programmers need to know is how to Google stuff, and what a programmer needs on her desktop is an IDE that auto-completes stuff so she’ll know what the methods are called. And of course, static typing to make sure she gets the method parameters right. Good to go.

Well, there are jobs out there like that. Last year’s jobs. But how do you know the one I’m filling in 2007 is one of those? Because I told you that for this position we are working with one of the world’s largest financial institutions on a public-facing J2EE application that has been in service for more than five years?

Now, I agree that you have figured out 98% of what we do. Most of the time, we mess with XSLT, message queues, JDBC through a DAL, and other buzzword-compliant tasks. The keys to success for those items are less about programming brilliance than about discipline, process, requirements management, and the other stuff the “no hard CS” folks want to talk about in the interview.

And of course, we care about skills in those areas. We have to, we can’t hire someone who can distribute data sets across a grid but is unable to negotiate requirements effectively with a business analyst. But let’s talk about the other 2%.

The top two percent

From time to time we get some challenges. Here are some recent examples:

As part of a refactoring effort last year, we wrote some Java that used Reflection and Dynamic Proxies to replace an entire layer of the application that used to include extensive hand-coding of stuff that was repetitive and error-prone. This saves us 80-90% of the code in that layer when we add new stuff to the application. A testing utility the year before used Reflection to automatically write a certain JUnit suite.

We’re working on something right now that is highly performant. We have a seven-figure user base, and peak loads are intense. You know how bit-twiddling in Java is irrelevant because you’re waiting for the database anyway? Well, we can’t afford to wait for this particular function: it’s one of those AJAX-y things that happens in real time. We can’t wait to go back to the database. So we have to load something into memory on the server, build a compact data structure, and traverse it quickly. And oh yes, we can’t have a lot of layers of crap, we need to get a response back to the browser with every key press.
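
As an illustration only (the post does not describe the actual data structure, and the real system was Java), one way to answer every key press from memory is to keep the candidate strings sorted and binary-search for the typed prefix. The sketch stays in Ruby like the other examples here; complete and the sample supplier names are invented, and Array#bsearch_index assumes Ruby 2.3 or later.

```ruby
# Hypothetical in-memory index: lower-cased entries, sorted once at load time.
suppliers = ["Acme Tools", "Braithwaite Ltd", "Brown & Sons", "Bruce Metals", "Smith Co"]
index = suppliers.map(&:downcase).sort

# Return up to `limit` entries of the sorted array that start with `prefix`.
def complete(sorted, prefix, limit = 5)
  prefix = prefix.downcase
  # Binary search for the first entry that sorts at or after the prefix.
  i = sorted.bsearch_index { |entry| entry >= prefix }
  results = []
  # Matching entries are contiguous, so walk forward until the prefix stops matching.
  while i && i < sorted.size && results.size < limit && sorted[i].start_with?(prefix)
    results << sorted[i]
    i += 1
  end
  results
end

p complete(index, "br")  # ["braithwaite ltd", "brown & sons", "bruce metals"]
p complete(index, "Sm")  # ["smith co"]
p complete(index, "z")   # []
```

Each lookup costs a binary search plus a walk over at most limit matches, with no trip to the database.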



For another client, we had to build a task dispatching system. It was like building a piece of a very lightweight fault-tolerant operating system, one that operated across data centres in three different cities, moving jobs around from centre to centre. If you were in an interview and someone posed one of those hypothetical “how would you design a …” problems, would you think they stole the problem from some Amazon programmer’s weblog? Would you think “you don’t need that for business programming?” Well, we built that. For an ordinary, brick and mortar business that makes physical stuff.

That 2% figure? That was in 2006. In 2007, it will be way more. Standards are rising. We’re doing more and more work that steps outside of the usual CRUD development.

Here’s the big reason why. Have you read people grousing about interviews where they’re asked about how to implement a search? And about what a waste it is because 99% of the time the database does it, and the rest of the time they stick it in a hash table? Well, in 2007 search matters. The database is a big part of that, but it’s not as easy as SELECT foo.* FROM bar WHERE ... any more.

The Google Effect

Google has become the “start page of the internet.” As a result, everyone now thinks that the way to find stuff is to do a full text search. Everyone thinks that relevant results should be first. And I mean everyone, not just your “early adopter” users: you now have Joe Average calling your customer support hotline and complaining that the search page on your application (the one with a different field for each column in each table) is too hard, and why can’t he just type something and get an answer?

Just how do you plan to implement full text search? Buy it from Google or Oracle? And do you think you can do the “Google Suggest” thing with the drop down? In real time?

Users now love having a single search box. They don’t want to have one box for searching on product SKU and a drop-down for searching on supplier name. One box. And if you want to make searching for supplier name easy, give them an auto-complete. Users don’t like to be given an application that basically has the implementation protruding into the interface.

They don’t care that you store first name and last name in separate columns, they want to search for “Reg Braithwaite” and find him, even if “Reginald” is what’s stored in the first name column and there are 3,215 Braithwaites in the table. You figure it out, possibly by word stemming, possibly by statistical analysis. Or maybe you’re just doing some untrained Bayesian classification to cluster the “Reginald Braithwaite” record with some other things the user is looking at right now so you put that record at the top of the list you returned.

Hmmm, we’re not in Kansas any more. It isn’t all about has_one, has_many, or has_and_belongs_to_many, and you don’t have to be a high-profile start up to care. Jane Average uses stuff like this when she reads her mail and books her vacation. But does her office HR support application work half as well? Why not?

This stuff isn’t rocket science. And you don’t need Common Lisp or Haskell to pull it off. You can do it in Java or C#: we do it and there are thousands of places just like ours where people just like us are doing it every day.

But in 2007, you do need to let go of the idea that all we’re doing with “business programming” is building web applications that are replacing the client-server applications of the eighties and nineties that themselves replaced the green screen terminal applications of the seventies and eighties.

The “leading edge” interface and ideas employed by Google, Amazon, eBay, and Yahoo! are suffusing our culture to become the standard user interface of web applications. And programming the standard user interface is a basic job requirement. Learn to love it.



Do you love applications like Google Mail? Would you like to write stuff like this, even if it’s less than 100% of the time? But are you looking for a stable company working on stuff you can explain to your neighbors? Michael Lucas is hiring intermediate and senior developers for positions in Toronto, Canada. To be considered for a position, please send Michael an email with your answer to the following question:

Name three features from public web ‘sites’ like Google, Amazon, and YouTube (you can pick any site or sites you like) that will make the jump to business applications in 2007.


Monday, January 15, 2007
  What I've Learned from Sales, Part II: Wanna Bet?
In Part I of “What I’ve Learned from Sales,” Don’t Feed the Trolls, we looked at why resistance to a new idea is expressed as a never-ending series of objections. We looked at one powerful way to avoid objections, by identifying a real, urgent problem that needs to be solved. The next installment, Part III, How to use a blunt instrument to sharpen your saw, describes the mind-set that there are opportunities for improvement to be found everywhere.

In this part, we’re going to disregard my advice to avoid objections and talk about one way to respond to many of the objections raised against new ideas.


Can we overcome objections?

Right off the top, I want to say that I don’t believe you can “overcome” an objection by frontal assault. And furthermore, you shouldn’t try. You cannot persuade someone to consider an idea by debating them into submission.

My belief is that you must discover and address their true problem. The first reason to do so is that if they do not have a problem your idea can solve, there is no reason for them to “change for change’s sake.” The second reason is that if you are not solving a real, genuine problem for them you will get caught defending an idea against an endless series of objections.

That being said, there are some circumstances where it is important to respond directly to objections. Even if we have carefully qualified someone’s problem and explained how a new idea will solve their problem, a prudent person will analyse the idea carefully, looking for fatal defects that could prevent it from solving their problem.

Another common circumstance is when there are several parties involved in presenting, discussing, and analysing an idea. Although one of the parties may be bringing up irrelevant objections to resist the idea, you may need to persuade another of the parties that these irrelevant objections do not have merit.

For example, you may be suggesting that agile meta-programming will solve a company’s problems to the CEO. You may have carefully qualified the CEO’s priorities. But the company’s IT department will raise objections because your proposals do not address their personal and departmental problems. So you will need to respond to some of their objections as part of a campaign to neutralize their influence.

What is overcoming an objection?

Overcoming an objection is, to borrow a phrase from law, “A sword, not a shield.” When you overcome an objection, you point out that the reason for not considering your new idea is fallacious, or does not apply in this case.

However, overcoming the objection does not actually provide a reason to change: it merely removes a reason not to change. This is why Part I goes on and on about discovering an immediate problem your idea can fix. If you overcome an objection but have not presented a compelling reason to change, nothing will happen.

In other words, if someone says “I like your idea, however...” you should attempt to overcome the objection. If someone says “Your idea stinks because...” you really need to identify a problem to solve.

One model for responding to common objections




Let’s assume you are going to respond to an objection, and you have good reason for doing so. Here’s one common form of objection, and a method for responding to the objection. In essence, I am going to share a pattern with you.

First, let’s look at one of the most common kinds of objections. Here are some examples. They all have something in common:


What these objections all have in common is what I call the “gotcha!” fallacy. The underlying assumption is that if your idea does not work 100% of the time, on 100% of the cases, it is no damn good. And thus I call an objection based on the gotcha fallacy a Gotcha Objection. A gotcha objection is a proposition that your idea is false based on a premise that is true infrequently or only true for some small number of cases.

P is for Pharmaceutical

Think about therapies in medicine. None of them are deterministic! Every pill, every technique, every therapy is described in probabilistic terms: When compared to the control group who drank a glass of red wine daily but did little exercise, 36.7% of those who combined daily exercise with a glass of red wine had an average improvement of 22.1% in their combined evaluation scores for cardiovascular health.

Try this the next time you’re at the doctor’s office: point out to your physician that you have heard that some people who exercise drop dead right after their daily run. Use this as an excuse not to exercise.

Now, software development is not the same thing as medicine, and I am not suggesting you respond to criticism by saying that, since some drugs do not work for some patients, your ideas have merit regardless of evidence or suggestions to the contrary. But I will walk you through the reasoning that leads to the same conclusion.

How to respond to gotcha objections

Our pattern for responding to these gotcha objections is to establish that software development is probabilistic in practice. Once we establish this premise, we turn the debate from whether there are cases where a particular practice does not appear to be optimal to whether the overall results of applying that practice are better than the results of not applying it.

That’s it, and if you’re in a hurry you can stop right here: everything else is an elaboration of this idea.

Exempli gratia: the technique in action

Objection: “Sometimes you are gonna need it, so you’re wasting time if you don’t build it in from the beginning. Therefore, we should build what we know we’ll need.”

Response: Well, sometimes you need it, sometimes you don’t. And when you don’t need it, you save the code you would have written as well as all the other design that becomes coupled to the code you end up throwing out. Of course, sometimes you end up throwing out some stub code, but think about the possibility that you will wind up getting more features done, earlier in the cycle where we can get feedback and reduce risk. Why don’t we look at whether, in aggregate, more projects will succeed using YAGNI than will succeed using “Build everything we might need”?

Objection: “Some bugs simply can’t be found through automated unit tests, sorry. This is why we have to use a programming language with static typing.”

Response: Sure enough, some can’t be detected with automated unit tests. But our choice of a dynamic language provides us with many other benefits, most especially in the areas of metaprogramming and code reduction. By writing less code, we may even have fewer bugs overall. Shouldn’t we try to compare similar projects written in a static language against those written in a dynamic language, and see whether the projects in the dynamic language had fewer bugs and whether the projects written in the dynamic language were more likely to be successful?

Now that you have seen the results of applying the technique, we will patiently examine the reasoning in detail.

Theory D and Theory P in Software Development




There are two schools of thought about the practice of managing software development (the theory of managing software development is of little use to us because “the gap between theory and practice is larger in theory than it is in practice”).


Critical Chain explains project management from the ground up in probabilistic terms. It's a significant improvement over the classical approach for managing risk and uncertainty in software development.
One school is that everything is fully deterministic in practice (“Theory D”). If development appears, from the outside, to be probabilistic, it is only because we haven’t discovered the “hidden variables” that fully determine the outcome of software development projects. And, since we are talking about development in practice, it is practical to measure the variables that determine the outcome such that we can predict that outcome in advance.

The other school of thought is that development is fully probabilistic in practice (“Theory P”), that there are no hidden variables that could be used to predict with certainty the outcome of a software development project. Theory P states that the time and effort required to measure all of the variables influencing a software development project precisely enough to predict the outcome with certainty and in advance exceeds the time and effort required to develop the software.

Theory P does not mean that software development cannot be managed in such a way that the desired outcome is nearly certain: the flight of an airplane is fully probabilistic as it encounters atmospheric conditions, yet we have a huge industry built around the idea that airplanes arrive at their destinations and land on the runway as planned.

Which theory fits the evidence collected in sixty years of software development? To date, Theory P is the clear winner on the evidence, and it’s not even close. Like any reasonable theory, it explains what we have observed to date and makes predictions that are tested empirically every day.

(Sidebar: do not confuse Computer Science—the study of the properties of computing machines—with Software Development, the employment of humans to build computing machines. The relationship between Computer Science and Software Development parallels the relationship between Engineering, the hard science of the behaviour of constructions, and Project Management, the employment of humans to construct engineered artefacts.)

The first response to the gotcha objection


The race may not always be to the swift nor the victory to the strong, but that’s how you bet
—Damon Runyon


Before we can go anywhere with gotcha objections, there is something you absolutely, positively must do when you respond. And it must be the very first thing you do. You must establish that the premise of the objection is not universal and not predictable: it is probabilistic.

The objection is of the form that “since your idea works out badly some of the time, your whole idea is bad.” You must respond by establishing that some of the time the idea doesn’t appear to work out, and some of the time it does appear to work out. It isn’t universally bad or universally good.

If you are in a face to face discussion, you can solicit agreement from the objecting party. For example:


In a less interactive environment like a running language flame war on Usenet, you can start by simply stating that the premise is not universal.

Having established that the premise is not universal you must then establish that the cases where the premise applies are not easily distinguished from the cases where the premise does not apply. Establish that nobody knows whether the premise will apply or not until after it has happened.

We simply can’t tell in advance whether the bugs that would be caught by a static type system will end up being significant to the outcome of a project. We can’t tell in advance which constructs will end up being a waste of time. And we can’t tell in advance which people will fail when they try to pair program. (The last point is absolutely true if the people involved are not doing the arguing about whether pair programming will work. If the programmers involved do not believe it will work, they may have a point.)

This is the other aspect of establishing that the premise is probabilistic: not only does it only apply some of the time, but we don’t have a good way of knowing in advance when it applies and when it doesn’t.

Okay, we’ve gone through all of this dry pseudo-academic talk of theories and probabilistic development. Time for a vacation to Las Vegas.

How do casinos make money?




Casinos make money by wagering money on the outcome of games with gamblers. (If the gambling industry offends you, I apologize. We could choose to look at how insurance companies make money instead: it is entirely the same thing.) The games are arranged in such a way that the casino holds a small mathematical edge on each play. Over the long run, with many gamblers playing the games many, many times, the casino inexorably makes money. The casino may lose games here and there, and some gamblers may enjoy temporary winning streaks, but overall, the casino wins more than it loses.

The very briefest exploration of statistics reveals the following facts about the casino’s strategy for making money:

  1. The casino must have an edge on each play;

  2. The more games played, the more likely that the casino will profit overall;

  3. Runs of good luck for some gamblers are offset by runs of bad luck for other gamblers.
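The second fact can be checked with a little arithmetic. Here is a minimal sketch, assuming a made-up game of independent even-money bets in which the house wins each play with probability 0.51. The method names and the 0.51 figure are my own illustration, not data from any real casino. It computes the exact binomial probability that the house has won more plays than it has lost:

```ruby
# Log of the binomial coefficient "n choose k", via the log-gamma function,
# so that large values of n do not overflow.
def log_choose(n, k)
  Math.lgamma(n + 1).first - Math.lgamma(k + 1).first - Math.lgamma(n - k + 1).first
end

# Probability that the house has won strictly more plays than it has lost
# after `games` independent even-money plays, winning each with `p_house`.
def chance_house_is_ahead(games, p_house)
  log_p = Math.log(p_house)
  log_q = Math.log(1.0 - p_house)
  first_winning_count = games / 2 + 1 # smallest win count that beats the losses
  (first_winning_count..games).sum do |wins|
    Math.exp(log_choose(games, wins) + wins * log_p + (games - wins) * log_q)
  end
end

HOUSE_WIN_PROBABILITY = 0.51 # hypothetical edge, chosen for illustration

[10, 1_000, 100_000].each do |games|
  chance = chance_house_is_ahead(games, HOUSE_WIN_PROBABILITY)
  puts format("%7d games: house ahead with probability %.3f", games, chance)
end
```

Under these assumptions the house is more likely to be behind or level than ahead after ten plays, and all but certain to be ahead after a hundred thousand. The edge is the same in every row; only the number of plays changes.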


Henk Tijms’s Understanding probability explains probability in simple terms requiring very little mathematics. Examples drawn from everyday life include analyzing investment returns, lotteries, and gambling. The book continues to build on the basics, working through Bayes' theorem up to multivariate and conditional distributions. A must-read for those working with data or seeking to understand risk analysis.
In fact, the casino’s strategy is so secure, there is just one danger to its profits (besides the obvious fear of losing their license to print money): if the casino plays very few games for very high stakes relative to their capitalization, they could lose their capital. One celebrated “whale,” Akio Kashiwagi, won more than $19 million in one casino and on another occasion won $6 million in a single session playing baccarat for $200,000 a hand.

Therefore, the casino’s prime safeguard is to avoid risking large amounts of capital on games played very few times.
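To see why few games at high stakes are dangerous, here is a rough sketch under assumed numbers: a hypothetical $10,000,000 is wagered against the house in equal, independent, even-money bets, and the house wins each bet with probability 0.51. The method name and all figures are invented for illustration. The expected profit depends only on the total wagered, but the spread around it (one standard deviation) depends on how many bets that total is split into:

```ruby
# Expected profit and standard deviation of the house's total result when
# `total_wagered` is split into `games` equal, independent, even-money bets.
def house_result(total_wagered, games, p_house)
  stake = total_wagered.to_f / games
  edge = 2 * p_house - 1 # expected house profit per dollar wagered
  expected_profit = total_wagered * edge
  # Each bet is +stake or -stake, so its standard deviation is
  # 2 * stake * sqrt(p * (1 - p)); independent bets add in variance.
  spread = 2 * stake * Math.sqrt(games * p_house * (1 - p_house))
  { expected_profit: expected_profit, spread: spread }
end

TOTAL_WAGERED = 10_000_000 # hypothetical
P_HOUSE = 0.51             # hypothetical

[50, 5_000, 500_000].each do |games|
  result = house_result(TOTAL_WAGERED, games, P_HOUSE)
  puts format("%7d games: expected profit $%d, standard deviation $%d",
              games, result[:expected_profit].round, result[:spread].round)
end
```

With fifty huge bets the spread is several times the expected profit, so a losing outcome is entirely plausible. With half a million small bets the spread is a small fraction of the expected profit.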

To summarize, the casino’s strategy is to:

  1. Arrange a small advantage on each game played;

  2. Play a very large number of games;

  3. Ignore good and bad runs, they will offset each other;

  4. Do not risk large amounts of capital on games played very few times.

Now that we know the casino’s strategy, let’s consider how they evaluate games. Imagine them sitting around a conference table, and someone suggests, “Let’s create a new game, Alaska Freeze.” What do they use to evaluate whether to add this new game to their casino?

Space in a casino is at a premium. If Alaska Freeze goes in, something else comes out. So there has to be an Incremental Value calculation: do they make more money with Alaska Freeze in and something else out? Or less? This calculation has two simple components: does Alaska Freeze have a larger or smaller mathematical edge than whatever it replaces, and will Alaska Freeze be played more or less often than whatever it replaces?
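As a back-of-the-envelope sketch of that calculation, suppose we describe each game by just the two components above: the house’s expected take per play and the number of plays per day. The `Game` struct, the method names, and every figure here are hypothetical:

```ruby
# A game described by the two components of the Incremental Value calculation.
# edge_per_play is the house's expected take, in dollars, from a single play.
Game = Struct.new(:name, :edge_per_play, :plays_per_day) do
  def expected_daily_take
    edge_per_play * plays_per_day
  end
end

# Positive when swapping the incumbent for the candidate should make more money.
def incremental_value(candidate, incumbent)
  candidate.expected_daily_take - incumbent.expected_daily_take
end

# Hypothetical figures: a thinner edge, but twice as many plays.
alaska_freeze = Game.new("Alaska Freeze", 0.30, 2_000)
incumbent     = Game.new("Whatever it replaces", 0.50, 1_000)

puts format("Incremental value: $%.2f per day", incremental_value(alaska_freeze, incumbent))
```

Notice what the calculation leaves out: whether any individual gambler ever walks away from Alaska Freeze a winner.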

Ok, let’s return to handling gotcha objections. I’m sure you knew all of this, I was just presenting it in a palatable morsel so you can feed it to someone objecting to your idea.

From casinos back to objections




You are handling an objection and you have established that its premise is neither universal nor predictable, it is probabilistic. Well, if it has an uncertain outcome, it is just like a casino game. And the process of developing software is just like the business of running a casino.

So the question is not whether a new practice ever has a case where it appears to “lose,” for the same reason that evaluating a new game for a casino does not involve worrying about whether gamblers will ever win.

The way to evaluate the idea is to examine it and see whether it fits the casino model:

  1. Does it offer a probable advantage each time it is employed?

  2. Can it be employed many times?

  3. Do streaks of “losses” and “wins” offset each other?

  4. Can we avoid risking the entire outcome of the project on infrequent events?

And if it does, to identify what idea or practice it replaces to determine whether it is a net win or a net loss overall:


Now, I am not going to say that XP, YAGNI, or dynamically typed programming languages necessarily fit the casino model and are necessarily better than the practices they replace (Classical Project Management, BDUF, and static typing). But what I will say is that there is a huge difference between saying “some of the time, for some of the people, that idea loses” and saying “overall, when applied to an entire project, the project does worse than whatever idea it replaced.”

So to handle a gotcha objection, we establish that it is probabilistic in nature, then we analyse it as we would analyse any other probabilistic practice: we look at the overall effect of the practice on the entire project, comparing it to whatever practice it would replace. And I have some easy-to-remember phrases for doing that.

The second response to the gotcha objection

The next step is rather obvious. You have to state the payoff those times that your idea or practice “wins.” And to be fair, you also have to agree to the cost of your idea or practice when it “loses.”


Needless to say, if you are in a face to face meeting you should solicit agreement to this second response as well. If you can’t establish that there are any benefits to your suggested practice, you have a great deal more work to do to handle this objection.

The third and final response to the gotcha objection

The final step is the clincher. Having established that the objection’s premise is probabilistic, and that for those times the idea or practice “wins” there is a positive payoff, it’s time to compare the overall benefit of the idea or practice to whatever it replaces. You want to shift the debate from debating the premise to debating the overall benefit.

And in fact, there are two different forms of the clincher. You can use either, or preferably both:

  1. Ask whether you can balance the benefits of using the practice when it wins against the cost of using the practice when it loses, then compare that net benefit with the net benefit of the alternative over one entire project, or;

  2. Ask whether you can compare the success rate of teams using the practice to the success rate of teams using the alternative in aggregate, or;

  3. Both.
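The first form of the clincher is an expected-value calculation. Here is a minimal sketch with invented numbers: I have no measured data for any of these practices, and the method name, the probabilities, and the payoffs are all assumptions made up for the example. It balances the “wins” against the “losses” over an entire project:

```ruby
# Net expected benefit of a practice, relative to the alternative it replaces,
# over `uses` independent occasions on one project.
def expected_net_benefit(p_win:, payoff_when_it_wins:, cost_when_it_loses:, uses:)
  per_use = p_win * payoff_when_it_wins - (1 - p_win) * cost_when_it_loses
  per_use * uses
end

# Hypothetical YAGNI figures, in developer-days:
# 70% of the time the speculative feature is never needed (3 days saved);
# 30% of the time it is needed after all (2 extra days to retrofit it).
yagni = expected_net_benefit(p_win: 0.7, payoff_when_it_wins: 3.0,
                             cost_when_it_loses: 2.0, uses: 20)

puts format("Expected net benefit over the project: %.1f developer-days", yagni)
```

The gotcha objection points at the thirty percent of cases where the practice “loses.” The clincher asks about the sum over all twenty decisions.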

And here are the example responses again, demonstrating the three forms of clincher:

Objection: “Sometimes you are gonna need it, so you’re wasting time if you don’t build it in from the beginning. Therefore, we should build what we know we’ll need.”

Response: Well, sometimes you need it, sometimes you don’t. And when you don’t need it, you save the code you would have written as well as all the other design that becomes coupled to the code you end up throwing out. Of course, sometimes you end up throwing out some stub code, but think about the possibility that you will wind up getting more features done, earlier in the cycle where we can get feedback and reduce risk. Why don’t we look at whether, in aggregate, more projects will succeed using YAGNI than will succeed using “Build everything we might need”?

Objection: “Some bugs simply can’t be found through automated unit tests, sorry. This is why we have to use a programming language with static typing.”

Response: Sure enough, some can’t be detected with automated unit tests. But our choice of a dynamic language provides us with many other benefits, most especially in the areas of metaprogramming and code reduction. By writing less code, we may even have fewer bugs overall. Shouldn’t we try to compare similar projects written in a static language against those written in a dynamic language, and see whether the projects in the dynamic language had fewer bugs and whether the projects written in the dynamic language were more likely to be successful?

Good luck handling objections. What is your experience: do you have another technique you can recommend?

A Personal Note

As Mike pointed out in the first comment, this post explains how to handle this one type of objection, the gotcha objection, by moving the debate away from the exception case and towards the overall case. But it does not follow up by presenting hard data to justify the example ideas presented.

First, I want to say that even with hard data you will not foster change with numbers: you need to show how your idea addresses an urgent priority. That should have happened before you got to this point. If you have addressed the problem correctly, it really is sufficient to point out the fallacy in the objection and allow your original argument to stand.

Second, there is a dearth of hard data about anything to do with software development. Repeat after me: “the plural of anecdote is not data.” If you have a source of hard data about any practice, be it programming languages, practices, or even interviewing techniques, I would very much like to read and learn from it.

Does this mean that we should never change, since there’s no proof that new ideas are an improvement over old ones?

If you are happy with your current situation, maybe not. If you are unhappy with your current situation, if you want things to be better, you may want to change something. It’s your call.


 

Sunday, January 14, 2007
  What I've Learned From Sales, Part I: Don't Feed the Trolls
This is the first part of “What I’ve Learned From Sales.” In this part, “Don’t Feed the Trolls,” I present my explanation for why people act like “trolls,” raising objection after objection to new ideas, and I suggest how to side-step this behaviour and deal directly with their concerns.

(Part II, Wanna Bet?, describes how to handle one very common form of objection. Part III, How to use a blunt instrument to sharpen your saw, describes the mind-set that there are opportunities for improvement to be found everywhere.)


“Oh no,” you must be thinking, “another Guy Kawasaki wanna-be trying to tell me how to sell things.” Well, yes to the wanna-be accusation, but no to the proposition that this is a general-purpose article about sales. I am not going to pretend to tell you how to promote your business, turn your product idea into a money-maker, or even how to sell yourself to an employer.

What I am going to share with you is some experience I have had with sales that strongly parallels my experience discussing new ideas with people. (I know, reasoning by analogy is often faulty. But it’s what we humans do, we’re pattern-matching machines.) If you find that people seem unreasonably resistant to good ideas like more powerful programming languages, putting people before process, or valuing working code above documents, you may find this helpful.


Guy Kawasaki’s The Macintosh Way explains how to create and evangelize great ideas, whether they are products for sale or world-changing movements.
Our model here is that the mental process of considering a new idea is the same as the mental process behind buying something. If you are discussing a new idea with someone—even if you aren’t actively trying to “sell” it to them—they are still going through the buying process. And if they have trouble accepting the idea, they will resist, or in sales jargon, they will “raise objections.”

Do you think this is specific to sales? No, when we see new ideas like the Ruby programming language, we encounter objection after objection. Some are ironic: a Java enthusiast objects to Ruby on performance grounds. Perhaps they are too young to remember when the C/C++ folks objected to Java on performance grounds? Or perhaps it will be the IDE support objection, or the Not Invented By Microsoft (a/k/a Not a Corporate Standard) objection, or any of a million others that are not completely unreasonable, but are also usually irrelevant.

What is an objection?

On the face of it, an objection is an expression of a discrepancy between your idea and what someone wants. So if they say “Lisp has too many parentheses,” you might think that they are saying “I would use Lisp if it didn’t have so many parentheses.”

The great secret we can learn from sales is that this is not true. As we will see below, people say what they think other people want them to say. So someone might be thinking “Lisp is too hard for me, all this talk of let and lambda and closures and tail recursion is confusing.” But they are embarrassed to admit this, so they seize on something that sounds more reasonable, like “the syntax is weird.”

We need to understand this, because the absolute worst thing you can do with an objection is answer it directly. If someone is really thinking “Lisp is too hard,” what good does it do to try to persuade them that parentheses are their friends? They don’t really care. Worse, if you trot out the benefits of homoiconicity and its applications to macros and introspection and meta-circularity, you’re actually making Lisp sound harder, not easier.




Let me give you an example from my own experience in sales. At one time I was a Macintosh salesperson. I used to sell Mac SEs and Mac IIs in “The Dark Times” after Steve Jobs was expelled from Apple by the vile and treacherous Prince John… but I digress. I was a Macintosh salesperson at a time much like this time: nineteen out of twenty computer sales were PCs.


Revolution in the Valley tells the incredible story of the creation of the Macintosh—from the perspectives of the people who were actually there. It’s packed with behind-the-scenes anecdotes and little-known secrets. Much of the material is available on line for free.
You would think that selling Macintoshes would be a lonely existence. But no, the phone would ring regularly and customers would visit the office on a daily basis, always with the same question, Why should I buy a Macintosh instead of a PC? And at first, I would answer this question. Macintoshes were superior to PCs in every way. (Actually, this was technically very true at that time. Why, you could run as many as six monitors on a Macintosh II! But I am digressing from my digression.)

I learned a funny thing about answering people’s questions. I would answer their questions, and they would argue with me. I would say that you could run multiple monitors on a Macintosh II, making yourself more productive. And they would say “but I don’t need to be that productive, so that doesn’t count.” Or I would say that the mouse and windowing interface is easier to use, so you can learn to use more programs. And they would say “but all I need is a word processor and a spreadsheet, so I don’t need to learn new programs.”

(Why would they do that? It was so that if someone asked them, “did you shop around and make an intelligent decision?,” they could reply, “why yes, I shopped around, I checked out Macs and PCs, I did a lot of research, and surprise surprise, I ended up with the exact same thing that my neighbour Bob has, except mine is running at 12MHz and poor Bob is stuck with 10MHz.” It may sound to you like they are doing an awful lot of work just to be able to say that one thing with a straight face, but I can tell you that there is this multi-billion dollar automobile industry that works on this principle: people want to be a little better than their neighbour, but not so much better that they are different than their neighbour.)

I can give you many more examples, but the interesting thing is not whether people wanted this stuff or not, but that no matter how convincingly I answered their question, they would just ask another one. Their questions had nothing to do with how they were making up their mind.




I learned very quickly that what people think often bears no relationship to how they behave. People usually say the things that they think other people expect them to say, but they go ahead and do whatever it is they always wanted to do. In the case of buying computers, my observation is that most people want to buy whatever it is that most people are buying. They want to belong, to fit in. So they are going to buy a PC. Or an iPod.

So what was really on their mind was fitting in, even though they argued about the Macintosh’s technical merits.

The lesson I learned is that before we can introduce a new idea to someone, we first need to understand what is really on their mind.

Salespeople call this “uncovering the hidden objection”. They have all these elaborate techniques for figuring out what’s really on a prospect’s mind when they encounter resistance to the sales pitch. I’m not going to suggest we do that. Instead, I’m going to suggest we avoid the objections in the first place by “qualifying.”


The most important principle of effective selling is that qualifying the customer is more important than overcoming objections.


What is “qualifying” and why is it the most important step in the sales process?

Many people say the most important step is “closing,” the art of getting the prospect to hand over the money, the end of the sale. If you judge by the behaviour of people selling time shares, fitness club memberships, and automobiles, this is the only thing salespeople work on: haranguing prospects and ‘overcoming objections’ by arguing.

Experienced and successful salespeople follow a different path. Experienced salespeople believe that the most important step is qualifying, the art of discovering whether a prospect has an actual need for the product, the beginning of the sale. Salespeople who are strong qualifiers spend almost no effort on closing because they are always working with prospects who want to buy.

On the other hand, if you do not discover their real problem, you extol virtues that have no attraction for them and neglect to address any perceived issues in their mind. The principle at work is that if you know what someone really needs, you address their needs from the very beginning. When you arrive at the conclusion, you have already addressed any questions they may have.

This is true for sales. It is also true for new ideas. If someone fears that an idea like learning Lisp (or meta-programming, or designing a program in the technical interview) would be too difficult for them, you will only be successful if you first explain how easily they will learn the new idea, and only then explain how wonderful the idea is.


If someone doesn’t have a headache, you cannot establish the value of an aspirin for them… Don’t focus on how you think your new idea can help them be better. Instead, focus on whether they have an urgent problem that your new idea can fix.


Although there are various models for understanding people’s motivations, such as Maslow’s Hierarchy of Needs or the Greed-Belonging-Exclusivity-Fear Quartet, my experience with development tools and methodologies is that the simplest model fits best: people are motivated to solve their problems. If you can identify a problem they think they have, you can show them how to solve it.


In On Intelligence, Jeff Hawkins explains how the human neocortex matches visual, audible, and kinaesthetic patterns—and replays them to form the basis of prediction. He makes a convincing case that the neocortex is the single most important distinction between humans and other species… and therein explains what makes humans human.
People without problems are not good prospects for lightweight development methodologies, new development tools, programming languages, or any other “change for the better.” Just the other day I was lunching with Dmitri from Opalis. We were talking about a development tool I am trying to build, and Dmitri was suggesting that it was a “vitamin” and not an “aspirin”.

I was taken aback. Isn’t improving software development important for everyone? Then I remembered my sales training and asked him about how things were going at Opalis. Dmitri admitted that his team was performing well and that he had built a lot of trust with his organization. So he didn’t have a problem. Quite simply, if someone doesn’t have a headache, you cannot establish the value of an aspirin for them.

Now, even Dmitri’s team has room for improvement, so it is not correct to say that there is no value in improved methodologies, tools, languages, or anything else for him. However, such things may not be a priority right now. This is exactly the same case as trying to sell a Macintosh to Bob’s neighbour: I believed that absolutely everyone could have benefited from owning a Macintosh. However, Bob’s neighbour didn’t think he had a usability problem, he thought he had an urgent “keeping up with Bob” problem.

And there’s the key: Don’t focus on how you think your new idea can help them be better. Instead, focus on whether they have an urgent problem that your new idea can fix.

Discovering their priorities shouldn’t be difficult. Why don’t we simply ask them? Well, there’s a trick to asking someone about their priorities. Remember, they will tell you what they think other people want to hear, not what they are thinking. Here’s an example concerning agile development:

Agilist: “What’re your priorities for the development team in the next 60-90 days?”

Manager Says: “I have a total commitment to process improvement and faster response to business initiatives,” …but thinks: “CMM Level Four (or, God willing, Level Five) will get me a higher profile and a shot at the CIO position. I need some consultants in here to start imposing some bondage and discipline over our development processes.”

The trick is to get specific and objective. Never take objections as evidence of their real needs, and never accept vague feel-good values at face value. Top salespeople don’t. Try calling a busy estate agent and saying you’d like to buy a house. I guarantee that the agent will ask you, “when do you need to buy a new house?” And so it is with new ideas:

You cannot position lightweight development, tools, languages, or any other type of change without being able to fit them into the specific and objective problems someone is trying to solve. You need to relentlessly pursue the immediate, urgent priority:

Startup Founder: “I’ve heard that Agile stuff is crap—it only works for star programmers who would be good no matter how you manage them.”

Agilist: “Well, there’re a lot of opinions out there. Tell me, what would you say is the most pressing issue facing your development team right now?”

Startup Founder: “Well, we have hacked together some great stuff, but we need to scale, and to scale without imploding we’ll need some discipline, some real management of the development team. That’s why we need a real methodology.”

Agilist: “I can understand the importance of scaling up. So, have you set some specific objectives for scaling up over the next month or two?”

Startup Founder: “For the next couple of months? Oh, it’s all about recruiting, definitely recruiting. I need another two or three top people to work on a new project that could be worth millions. We’ve identified some good candidates, but it’s very difficult to get them to accept an offer from a start up.”

Agilist: “You know, we really ought to consider whether using Agile might help you recruit—have you considered the possibility that some star candidates might be looking for an environment that is more Agile than the one they are leaving?”

Startup Founder: “Hmmm…”

As you can tell, once you have a specific problem with specific dates attached to when the problem needs to be resolved, you can discuss a specific solution. You’ve side-stepped the useless “objections.” One more time: do not accept vague objectives, get specific objectives with near-term dates attached to them.

If someone really doesn’t have any applicable near-term objective, you will not be able to introduce a new idea to them. So don’t be surprised if they express very little interest. But when you have an immediate, specific objective in hand, you can position the idea as a solution to their problem.

And that works for almost any idea. Say you had a new programming language designed for set-top boxes. But it turns out nobody has a “programming set top boxes” problem. So they raise objections about the speed of your virtual machine, or the fact that programmers cannot manage memory in your language, so they cannot squeeze programs into very small spaces.

Should you keep pounding away at that? Or go looking for an immediate problem people have, like building web applications?

If you side-step their objections—like memory management—and get to the root of their immediate needs, you might be able to introduce a new popular programming language. Good luck!

Part II, Wanna Bet?, describes how to handle one very common form of objection. If you liked this post, you might also like a related post, The false dichotomy of choosing between your values and expediency.


 

Saturday, December 16, 2006
  Lisp is not the last word
Ken Tilton asked: What is up the power continuum from Lisp?

I don’t have a ready answer. However, just because I don’t have an answer doesn’t mean I don’t believe there’s an answer. It could be that Lisp is a little like Democracy. It could be the least powerful programming language possible, excepting all of the others invented so far. But you know what? I have faith we can do better.

Ken doesn’t say there isn’t a language up the power continuum. And I won’t say we have already invented one: like Ken, I’ll pose a question: what law of computer science places a limit on the power continuum at Lisp?



G.J. Chaitin explains his proofs of Kurt Gödel’s incompleteness theorem and Alan Turing’s “halting problem” in computation. Chaitin’s creative use of Lisp in mathematics and fervent belief that no theorem is proof against new analysis are welcome shots of espresso.

Human history is chock-a-block full of inventions and practices that were considered for decades or even centuries to be the final word, the ultimate expression and implementation of ideas. And then someone came along and demolished everything. Geocentricity. Heliocentricity. Newtonian celestial mechanics. Light as a wave. Light as a particle. Three dimensions. Uniform space. Euclidean geometry.

Some of these new ideas took years to take root while the establishment derided them as “not even wrong.” Others were so obviously right they immediately displaced what had come before. We might not have invented a more powerful language. Or we might have invented one but not realize it yet. But who can say that we haven’t invented a more powerful language and will never do so?

If you believe there is a power continuum, if you are not so obsessed with Turing Completeness and theoretical equivalence, what is the argument that it has any limit whatsoever, let alone that its limit is Lisp?

I believe that the only language that is affixed to the top of the power continuum is Blub. For everyone whose imagination soars above the ceiling of their laboratory, Lisp is not the last word.


 

Wednesday, November 22, 2006
  The significance of the meta-circular interpreter
A self-interpreter is a “programming language interpreter written in the language it interprets.” A meta-circular interpreter is a special case of a self-interpreter that applies only to programs where the primary representation of the program is a primitive data type in the language itself (this property is called homoiconicity). Lisp is such a language because Lisp programs are lists of symbols and other lists. XSLT is such a language because XSLT programs are written in XML.

(If you have ever written an XSLT that transforms other XSLTs, then you immediately grasp the advantage of a meta-circular interpreter over an ‘ordinary’ self-interpreter: it is not just possible, but it is easy to write programs that write programs, because you don't have to fiddle with transforming each program into an abstract data structure (typically a tree) that can be manipulated by your program.)
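
To make “the program is a primitive data type” concrete, here is a minimal sketch in Ruby. Ruby itself is not homoiconic, so the sketch assumes a toy Lisp-like language whose programs are nested Arrays of Symbols and numbers. The names `evaluate` and `swap_operator` are hypothetical, invented for this illustration, and the evaluator only handles operators that are also Ruby methods on numbers.

```ruby
# A toy language whose programs are plain Ruby Arrays (an assumption made for
# illustration; this is not how Ruby or any real Lisp is implemented).
# [:+, 1, [:*, 2, 3]] stands for (+ 1 (* 2 3)).
def evaluate(expr)
  return expr unless expr.is_a?(Array)      # numbers evaluate to themselves
  operator, *operands = expr
  values = operands.map { |operand| evaluate(operand) }
  values.inject(operator)                   # e.g. [1, 6].inject(:+)
end

# Because a program is just an Array, a program that rewrites programs needs
# no parser: it walks the Array and builds a new one.
def swap_operator(expr, from, to)
  return expr unless expr.is_a?(Array)
  operator, *operands = expr
  [operator == from ? to : operator,
   *operands.map { |operand| swap_operator(operand, from, to) }]
end

program   = [:+, 1, [:*, 2, 3]]
rewritten = swap_operator(program, :*, :+)

evaluate(program)    # => 7
evaluate(rewritten)  # => 6
```

The rewriting step never touches source text. With an ordinary self-interpreter you would first have to parse the program into a tree before you could do the same thing.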

This is interesting to both language theoreticians and hobbyists. But does it matter to those of us trying to get things done? What is the significance of meta-circular interpreters and self-interpreters?

The short answer is that if you are working on meta-programming, a self-interpreter makes all things not just possible, but practical. The lack of such an interpreter places a limit on how much you can accomplish using your implementation language.

Meta-programming

Let’s start our examination of the significance of self-interpreters with a review of meta-programming, or bottom-up programming. This is the practice of constructing a programming language tailored to your problem space. You get your language’s basics working by building it on top of a base, or implementation language, and then you build your solution in your solution language.

While meta-programming, you are working on two tiers either simultaneously or alternately: you work on expressing your solution in your solution language, and you work on the implementation (whether that be fun stuff like making it expressive or plumbing stuff like making it fast enough to be practical) in your implementation language.

An important and popular sub-domain of meta-programming is the practice of writing Domain-Specific Languages or DSLs. DSLs are solution languages tailored to resemble as closely as possible the human jargon of “domain experts.”

(Obie Fernandez has an excellent presentation on the rationale for and construction of DSLs.)
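
As a rough sketch of the two tiers, here is a tiny internal DSL in Ruby. The `DiscountRules` class, its `discount` vocabulary, and the pricing policy are all hypothetical, made up for this illustration. The only Ruby machinery assumed is `instance_eval` and keyword arguments.

```ruby
# Implementation tier: a tiny, hypothetical rules language built on Ruby.
class DiscountRules
  def initialize
    @rules = []
  end

  # Solution-language vocabulary: `discount 10, for_orders_over: 100`
  def discount(percent, for_orders_over:)
    @rules << [for_orders_over, percent]
  end

  # Returns the largest discount whose threshold the order total exceeds.
  def percent_for(order_total)
    @rules.select { |threshold, _| order_total > threshold }
          .map { |_, percent| percent }
          .max || 0
  end

  # Evaluates the block with a fresh rule set as `self`, so bare calls to
  # `discount` inside the block land on that rule set.
  def self.define(&block)
    rules = new
    rules.instance_eval(&block)
    rules
  end
end

# Solution tier: reads close to how a domain expert might state the policy.
rules = DiscountRules.define do
  discount 5,  for_orders_over: 50
  discount 10, for_orders_over: 100
end

rules.percent_for(75)   # => 5
rules.percent_for(250)  # => 10
rules.percent_for(20)   # => 0
```

Everything inside the class is implementation-language work. The block passed to `define` is the solution language, and it is the only part a domain expert would need to read.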

DSLs are commonly cited as useful in areas where programmers who are not domain experts collaborate with domain experts who are not programmers. In my own experience, DSLs are also valuable even when the programmers are themselves the domain experts. My rule of thumb for whether a DSL would be worthy of consideration is to ask how two programmers discussing the solution to a problem over an IM client would talk. If their language would closely resemble the natural constructs and idioms of their chosen implementation language, there is no need for a DSL.

This is not always the case. A very classic example is that of SQL. SQL is a DSL designed for programmers to express relational algebra rather than the imperative steps for performing queries and updates. Although complex cases are impenetrable to the journeyman, its general form closely follows the way even a non-technical person would express their thoughts about data that is stored in tables.

Another example of a successful DSL for programmers is the regular expression engine present in almost every language (whether baked in or as a commonly available library). Programmers do not discuss searching for text patterns in terms of backtracking and lookup tables and loops unless they are implementing a search library themselves.

Programmers talk about matching the string ‘Nokia’ followed by four digits, a forward slash, and two numbers separated by a period in the User-Agent Header. Regular expressions, while imperfect, match the way programmers think and talk about text matching much more closely than writing out the most efficient steps using strstr and for loops.
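
The description above translates almost word for word into a Ruby regular expression. This is only a sketch: the header value is a made-up sample, and the capture groups and variable names are my own choices rather than anything a real device or library prescribes.

```ruby
# Hypothetical header value, made up for illustration.
user_agent = 'Nokia6230/2.0 (04.44) Profile/MIDP-2.0 Configuration/CLDC-1.1'

# "Nokia", four digits, a forward slash, two numbers separated by a period.
nokia_pattern = %r{Nokia(\d{4})/(\d+)\.(\d+)}

match = nokia_pattern.match(user_agent)
model, major, minor = match.captures if match
# model => "6230", major => "2", minor => "0"

# A header that does not fit the description simply fails to match.
nokia_pattern.match('Mozilla/5.0 (Macintosh)')  # => nil
```

The pattern reads in the same order as the spoken description, which is the point: the strstr-and-loops version would bury that description in bookkeeping.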

What about the meta-circular interpreter?

Meta-circular interpreters are nothing new. Lisp is most famous for its meta-circular interpreter. But do you know that one of the world’s most popular programming languages has the next best thing, a self-interpreter? The C programming language’s compiler is written in C. That’s interesting. But why is it significant?

Well, instead of heading up the ‘power continuum’ and talking about Lisp, let’s stay with C for a moment. If you’re a C programmer and you become very, very interested in building a better programming language, you have in your hands the tool to make changes as you see fit.

You might, for example, start with a pre-processor and implement the first version of your C++ language by mechanically translating a C++ program to C. Or you might bootstrap your C++ language by writing a compiler for C++ in C. Because you have in your hands all of the tools for going from source to running program, you can enhance and change its behaviour exactly as you please.

When a language is not implemented in itself, you have limitations on your ability to create new forms. One might argue that Lisp’s macros make it possible to build any other language paradigm on top of Lisp. This is partially correct; however, macros alone are not a complete answer. Macros act to rewrite local sections of programs.

You cannot—to my knowledge—take a dialect of Lisp that does not support tail recursion and use macros to execute tail calls in constant space without rewriting every function using your macro. A similar argument holds for using macros to implement continuations. You must manually rewrite every function using your macros if you want to change the global behaviour of your program.
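
The same problem can be sketched in Ruby, which does not eliminate tail calls in its default configuration. A helper (here a trampoline, standing in for the macro) can run tail calls in constant stack space, but only for functions that have been rewritten by hand to cooperate with it. The names `sum_to`, `sum_to_step`, and `trampoline` are hypothetical, chosen for this illustration.

```ruby
# Ordinary tail-recursive sum. Without tail-call elimination every call adds
# a stack frame, so a large enough n can raise SystemStackError.
def sum_to(n, total = 0)
  n.zero? ? total : sum_to(n - 1, total + n)
end

# The hand-rewritten version: instead of making the tail call, return a
# lambda that describes it ...
def sum_to_step(n, total = 0)
  n.zero? ? total : -> { sum_to_step(n - 1, total + n) }
end

# ... and let a loop (the "trampoline") make the calls one at a time, so the
# stack never grows. Assumes the final result is never itself a Proc.
def trampoline(result)
  result = result.call while result.is_a?(Proc)
  result
end

trampoline(sum_to_step(1_000_000))  # => 500000500000
```

Nothing here changes how `sum_to` or any other existing function behaves. Each one has to be converted to the `_step` style individually, which is exactly the manual, per-function rewriting described above.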

Manually blank every blank. This sounds an awful lot like something we can automate. Automatically transforming entire programs is the province of interpreters and compilers, isn’t it? If you wish to perform transformations that are global in scope, you want a custom interpreter. Of course, you can write one from scratch in your implementation language.

But… This sounds like we are headed towards the Turing Tar Pit. Isn’t it much easier to use the interpreter that’s built in? Implementation languages that provide an interpreter or compiler for themselves provide an industrial-strength, debugged platform for the construction of solution languages.

This is the other half of the power of Lisp: if you want to change deeper fundamental language features like whether you have a Lisp-1 or a Lisp-n, or whether all evaluation is lazy, or…, or…, or… you can, because Lisp interprets Lisp and Lisp compiles Lisp.

It is for this reason that languages like Ruby, which is implemented mainly in C, provide less maximum power than languages like Smalltalk, which is implemented mainly in… Smalltalk. For example, there is talk—at this time—that continuations will be temporarily dropped from Ruby 1.9. If you have an application making heavy use of continuations, what is your upgrade path?

Of course, not everyone thinks they need all of a more powerful meta-programming implementation. You will have to decide for yourself whether a language without a meta-circular evaluator or self-interpreter offers other benefits that outweigh this significant feature.

Serving the self-interpreter to your canine companion

As Simen pointed out in a comment, Ruby does have a third-party project to interpret Ruby in Ruby called Rubinius. Assuming it graduates from its current status as an experimental work-in-progress, how is this different from having a self-interpreter baked into the language?

When a language is built on top of a self-interpreter, the language designers are forced to eat their own dog food.

Now it is not correct to say that Matz does not use Ruby. He does, and so he does eat his own dog food. And for the domains where he uses Ruby, he has optimized Ruby to be a useful tool. But since he doesn’t use Ruby to build Ruby, he does not have the same incentive to tune Ruby for the purpose of building languages.

An obvious example is Ruby’s performance. It is perfectly fine for building CRUD applications. However, it lacks really high performance when working with complex data structures in memory. This is the kind of thing that an interpreter or compiler has to do when parsing code or managing cactus stacks.

Imagine what would have happened had Matz become impatient waiting for Ruby to interpret itself when he was first building the language? I suggest that the implementation and language would both be tweaked to bring performance up an order of magnitude or more. Programmers are notoriously impatient with slow tools.

The implementation is not the only thing that improves when a language designer eats her own dog food by baking a self-interpreter into the language. The design of the language itself changes. Larry Wall has said that “Languages differ not so much in what they make possible, but in what they make easy.” When a designer builds their new language in itself, the language invariably makes building languages easy.

So I suggest that the presence of a self-interpreter baked right into the language, not bolted on as an after-thought, forces a language to be useful for building solution languages.

(This is not a broad criticism of Ruby, or a suggestion that Ruby is not an excellent tool for building a wide variety of useful solution languages. I’m just trying to point out the salient distinction between baking a self-interpreter into a language from the beginning and bolting one on the side.)

But I’m a tool maven, not a language maven

Of course you are. So tell me, how does Eclipse do all of its magic with Java, a language lacking a self-interpreter?

The answer is here. The Eclipse team based Eclipse on VisualAge for Smalltalk. They were more than familiar with the benefits of having a self-interpreter, so they wrote most of one themselves, in Java. Self-interpreters are also the basis for building advanced tools.


 

Friday, November 17, 2006
  The first seven books I would buy if my shelves were bare
Lucas Carlson won a $100 gift certificate on Amazon.com for his 2nd place entry at Rails Day (in collaboration with John Butler). Congratulations, Lucas!

Lucas asked for suggestions on spending the money. I tried suggesting a new iPod Nano loaded up with the SICP lectures in video podcast form. But as you would expect for someone working in a music-related venture, he has plenty of toys already.

So… here are the first seven books I would buy if my shelves were bare (in no particular order):
  1. Structure and Interpretation of Computer Programs. You can read it for free on line, but it’s even better as a physical book. One for the ages, it’s the kind of thing that ought to be bound in rich leather (if you go for that sort of thing) and kept in the library you build for your luxury castle.

  2. To Mock a Mockingbird. It seems you can’t raise micro-capital these days without understanding fixed point combinators. Here’s the most enjoyable text on the subject of combinatory logic ever written. What other textbook features starlings, kestrels, and other songbirds?

  3. The Media Lab: Inventing the Future at M. I. T. Stewart Brand’s book captures the legendary think tank’s culture and ideas. Compare and contrast their view of broadcatch with today’s RSS feeds, or narrowcasting with today’s 500 channel television.

  4. On Intelligence. A book that shook my views about how my brain works. To pick one nugget out of many, neurons are so slow that in the time it takes for us to react suddenly—say to duck a flying object—there is only time for a chain of at most 100 steps to complete. 100 steps do not permit us to perform any complex reasoning or look-up. Jeff Hawkins explains how the neocortex can accomplish complex tasks using layers of parallel switches.

  5. Philip and Alex’s Guide to Web Publishing. While the technologies suggested (TCL, AOLServer) are unlikely to float your boat, this is the most beautiful technical book on my shelves. Philip’s approach and his advice on how to build software for web publishing are still relevant several generations of web developers later. (Also available on line for free.)

  6. The Recursive Universe: Cosmic Complexity and the Limits of Scientific Knowledge. Although it seems to be out of print, hunt down a copy for yourself. A thrilling journey into the ideas of the great John Horton Conway and computation’s building blocks. Best of all, it’s explained beautifully using the legendary Game of Life. Who knew that puffer trains and spaceships are Turing Complete?

  7. QED: The Strange Theory of Light and Matter. Yes, the full Feynman Lectures are incomparable, and for further thrills you can listen to him give the lectures on audiobook. But in QED, Feynman does this one magical thing: he explains how a mirror reflects light. And in the process of explaining how light actually reflects off a mirror, Feynman deconstructs classical physics and rebuilds our understanding with Quantum Electrodynamics. For a moment, you can understand how little we really know about how the universe works.

Is there a book you would recommend? What’re your feelings about the books I’ve suggested?

p.s. Shane Sherman’s The 5 Books that Every Programmer Should Read


 

Wednesday, November 08, 2006
  Take control of your interview
I’m helping some colleagues interview programmers, and for once I’d like to offer some suggestions to the interviewees and not the interviewers. This post is about using the interview to maximize your chance of landing the right job.

have an objective

First, you need to walk in with an objective. No, not “to secure a strategically progressive position leveraging your forward-going architectural vision and hands-on experience to advance the division’s mission,” but a simple, tactical objective for the interview.

Here’s what I suggest. Think of three things you want the interviewer to know about you that you think they are unlikely to find out if they ask all the questions.

The important ideas are that (a) you want the interviewer to know about each of the three things, and that (b) the interviewer is unlikely to ask about all three if you don’t exercise some control over the interview.

Let’s start by ruling a few things out. First, don’t bother with how many years of technical experience you have with the company’s tools and platforms. If they are using in-house Common Lisp macros compiled to C and then distributed on a grid with MapReduce, I guarantee that they will ask whether you have any Lisp or distributed programming experience all by themselves. (You can still take advantage of your experience, I’ll show you how below.)

Second, rule out anything that doesn’t really help sell them on you. Yeah, yeah, hockey is competitive and it takes a special kind of focus to be a goalie. I get that, but it really only sells when the interviewer is also a hockey player. My suggestion is to put that kind of thing at the bottom of your resumé and let them ask about it if they care.

the three things

The three things you want to take into the interview should be stuff that matters to them but is hard for them to ask about. Stuff that doesn’t pop out of your resumé. I don’t have a pat formula for generating these items that I can share, but here are a couple of ideas to get you started.

Remember the technology buzz-phrases we rejected (“five years of JEE” “Common Lisp”)? Think about the difference between yourself, presumably an expert in these areas, and someone who has only worked on one project with the same technology. You both touched all the same tools and code, but there’s something special about you, your extra experience means something. What is it?

Joel Spolsky and Peter Norvig both suggest that you cannot pick up a new programming language or platform and become proficient overnight. That being said, if you tell me that you have five years of experience, I have no idea whether you have five years of experience or whether you simply did the same thing over and over again. What is it that you learned that makes you special? What secrets do you possess that can’t be listed on your resumé?

For each person, there will be different answers. One person might say that their “secret sauce” is that they have learned the ins and outs of the platform, they know what works and what doesn’t, they know how to work around the shortcomings. Another might emphasize the non-technical skills. I would personally be impressed with anyone who said that what makes them special is that they are very, very accurate when they estimate tasks and projects.

This leads naturally to another area I would mine for nuggets: Soft or non-technical skills.

99% of the dreck you read about interviewing programmers is about finding out whether they can actually write programs. Rightly or wrongly, most interviewers focus on the hard skills. But if they never get around to discovering your ability to juggle priorities, or write effective technical documentation, or analyze requirements, how will they know that you are far and away superior to the other five people they will interview this week?

You have to tell them, that’s how they’ll find out.

tell them about it

Walk into the interview with your three things. Now, the interview is a game. Presuming you don’t get thrown out or you don’t walk out before it ends, you win the game if the interviewer discovers all three of your things. If you like games as a metaphor, call them goals.

If the interviewer asks you about one of your things, that’s an easy goal. Take it.

If the interviewer asks you if there’s anything you’d like to mention about yourself, that’s an easy goal. Talk about one of your things.

If the interviewer asks you about something related to one of your things, answer the interviewer’s question and then ‘coat tail’ your thing onto the end. For example:

Interviewer: It says here you worked on JProbe, that’s a Java tool, right?

Interviewee: Absolutely. JProbe is a suite of tools for JEE Server-Side development (answers the interviewer’s question). Under my management, we released three consecutive versions on schedule. I’m really hoping that there’s an opportunity to apply my focus on hitting plan to this team (adds on).

Watch politicians “staying on message.” No matter what question they are asked, they spit out their own pat statements. You don’t want to be that plastic, but you have to take responsibility for a successful interview. And you know what? If the interview ends with your best features undisclosed, the company loses as well.

One more time with the coat tail:

Interviewer: What’s your experience with Ruby?

Interviewee: Well, I was the lead developer on the Certitude project. We built that with Rails and we included a fairly heavy dose of Ruby idioms, including a domain-specific language for pattern-matching and lots of dynamic meta-programming (answers the interviewer’s question). One of the things I discovered on that project was the importance of a bomb-proof quality control process when you have such a powerful language. I customized our continuous integration server to track changed files, tests, and bugs in a unified report interface so we could monitor the most troublesome modules. It really saved our bacon late in the project when we had to really tighten up our risk management to ship on time (the add on, emphasizing the process).

if all else fails

Perhaps you didn’t get a good opportunity to mention your three things and the interview is winding down to a close. Don’t try a desperation coat tail where you try to stick two completely unrelated things together. Instead, try using a question to introduce one of your things indirectly.

Some examples: