Writing programs for people to read
Programs must be written for people to read, and only incidentally for machines to execute.
This is about writing programs in a style that favours human comprehension over the convenience of the machine.[1]
Norbert Winklareth recently raised the question of minimizing the semantic distance between the program as written and the solution to the problem as conceived by the programmer.
Norbert was talking about comparing the capabilities of programming languages, but the idea of semantic distance is also useful for comparing programs to each other. Although this is not the entirety of writing good programs, let’s examine this idea in more detail.
Indeed, let’s look at one very simple, very powerful way of writing programs that are as semantically close as possible to the solution in the mind of the programmer.
Code that resembles its result
A template is a blueprint for describing the result you want, where instead of embedding data inside executable code, you turn things inside out and embed executable code inside data.
Templates are very popular in programs that generate markup:
<TITLE>Hello World</TITLE>
Hello, Example. Today's date and time is <%=Now()%>.
It’s obvious what result you want, much more obvious than if you tried the following:
page = new Page();
head = new Head();
title = new Title();
title.setText("Hello World");
body = new Body();
preamble = new StringBuffer();
preamble.append("Hello, Example. Today's date and time is ");
This code produces its results as a side effect of its execution. The code itself doesn’t directly describe the result, whereas the first example directly describes the result we wish to generate.
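The same contrast can be reproduced in a few lines of Ruby. This is only a sketch: it uses ERB from Ruby's standard library as a stand-in for the ASP template above, and the names TEMPLATE, render_with_template, and render_procedurally are made up for illustration.

```ruby
require "erb"

# Template style: the source text looks like the page it produces.
TEMPLATE = <<~HTML
  <html>
  <head><title>Hello World</title></head>
  <body>Hello, Example. Today's date and time is <%= now %>.</body>
  </html>
HTML

def render_with_template(now)
  ERB.new(TEMPLATE).result_with_hash(now: now)
end

# Procedural style: the page only exists as a side effect of the appends.
def render_procedurally(now)
  page = +""
  page << "<html>\n"
  page << "<head><title>Hello World</title></head>\n"
  page << "<body>Hello, Example. Today's date and time is "
  page << now.to_s
  page << ".</body>\n"
  page << "</html>\n"
  page
end
```

Both methods return the same string for the same argument, but only the first lets you see the page without running the code in your head.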
Sometimes you need to generate the result as a side effect of the code. You needn’t write code as opaque as the example above; instead, you can organize your code so that its form resembles the form of the result you are generating, as in this Scriptaculous code:
element = Builder.node('div',{id:'ghosttrain'},[
  Builder.node('div',{className:'controls',style:'font-size:11px'},[
    Builder.node('h1','Ghost Train'),
    "testtext", 2, 3, 4,
    Builder.node('li',{className:'active', onclick:'test()'},'Record')
  ])
]);
That produces this HTML:
<div id="ghosttrain">
  <div class="controls" style="font-size:11px">
    <h1>Ghost Train</h1>
    testtext234
    <li class="active" onclick="test()">Record</li>
  </div>
</div>
Just like the template example, you don’t need to run a simulation in your head to try to figure out what the code produces.
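The builder idea carries over to other languages. Here is a rough Ruby sketch with a hypothetical node helper, loosely modelled on Builder.node. It is not part of any library, and it skips HTML escaping entirely. The point is only that the nesting of the calls mirrors the nesting of the markup.

```ruby
# Hypothetical helper: returns the HTML for one element.
# Children may be strings, numbers, or the results of nested node calls.
def node(tag, attrs = {}, children = [])
  attr_text = attrs.map { |name, value| %( #{name}="#{value}") }.join
  "<#{tag}#{attr_text}>#{Array(children).join}</#{tag}>"
end

html =
  node(:div, { id: "ghosttrain" }, [
    node(:div, { class: "controls" }, [
      node(:h1, {}, "Ghost Train"),
      "testtext", 2, 3, 4,
      node(:li, { class: "active" }, "Record")
    ])
  ])
```

Reading the call from the outside in gives you the element tree directly: a div containing a div containing a heading, some text, and a list item.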
Is this just cancer of the semicolon?
What’s the difference between these two code samples?
preamble = new StringBuffer();
preamble.append("Hello, Example. Today's date and time is ");
"Hello, Example. Today's date and time is #{Time.now}."
Is the second just syntactic sugar for the first? No. It’s more than just syntactic sugar. People have a habit of saying “syntactic sugar” in a dismissive way. It’s another form of the argument that, since an underlying language is Turing Equivalent, there is no need for a particular language feature.
Not all language features are just syntactic sugar. True syntactic sugar features are local features: you can replace the feature with some other equivalent code without having to change a bunch of stuff elsewhere.
Lazy evaluation and garbage-collected memory management are not syntactic sugar: they require wholesale changes to the underlying model of computation to work. The abbreviated for loop in Java 1.5 is syntactic sugar: you can translate each loop into the equivalent old-style iterator loop without any additional support. For that matter, Java enums are also syntactic sugar; they’re a way to write the Type Safe Enum idiom with less boilerplate.
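To see what a purely local translation looks like, here is a rough Ruby analogue of the Java loop example (not the Java translation itself). Ruby’s for loop is evaluated by calling each on the collection, so either form can be swapped for the other without touching anything else. The Countdown class is made up for the illustration, and the sketch ignores one small difference: for does not open a new scope for the loop variable.

```ruby
# A minimal collection: the only thing it knows how to do is `each`.
class Countdown
  def initialize(from)
    @from = from
  end

  def each
    @from.downto(1) { |n| yield n }
  end
end

# Sugared form.
sugared = []
for n in Countdown.new(3)
  sugared << n
end

# Desugared form: the same loop written as a call to `each` with a block.
desugared = []
Countdown.new(3).each do |n|
  desugared << n
end
```

Nothing outside the loop had to change to move between the two forms, which is what makes the feature local.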
Okay, non-local features are not syntactic sugar. When is a local feature “just” syntactic sugar and when is it something more than that?
Let’s compare these two language features. Consider this Smalltalk code:
window
    position: 80@80;
    extent: 320@90;
    backcolor: Color blue;
    caption: 'My Blue Test Window'.
This shows a series of “cascading messages” to the same receiver. It saves you having to type the word “window” again, and it is a lot easier to read, because it lets you group messages that obviously belong together.
And earlier, we saw:
"Hello, Example. Today's date and time is #{Time.now}."
This is String Interpolation.[2] It’s an “abbreviation” for a longer sequence of appends onto a string buffer.
The difference between these two trivial cases is that the first example doesn’t change your mental model of what’s going on when you read the code: you’re simply sending a bunch of messages to the same receiver.

The second example is different in a very important way: honestly, when you read the second example, do you think “Aha! Start with a string, get Time.now(), append it to that string, then append a period”?
No way! You think “A String with Time.now() stuck in it.”
That’s a huge difference mentally: it’s not just shorter, it’s semantically closer to your mental model of the result you’re trying to achieve.
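Here are the two forms side by side in Ruby, as a small sketch. The fixed timestamp is an assumption made only so the two results can be compared; the original uses the current time.

```ruby
now = Time.at(0).utc # fixed instant, chosen only to make the comparison repeatable

# Append style: the result is assembled step by step.
preamble = +""
preamble << "Hello, Example. Today's date and time is "
preamble << now.to_s
preamble << "."

# Interpolation style: the literal already looks like the result.
greeting = "Hello, Example. Today's date and time is #{now}."
```

The two strings are equal, so the machine sees no difference. The difference is entirely in what the reader has to reconstruct.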
Wrapping up code that resembles its result
In summary, one way to write code that is comprehensible is to make sure that the form of the code matches the data the code generates. This is a very general principle: it can be found in web templates (like PHP and ASP pages), markup builder libraries, and even String or List Interpolation.
Features that support this style of writing code are more than simple syntactic sugar, because they alter the reader’s mental model, lowering the semantic distance between the code and the code’s result.
Bonus! Order now, and we’ll throw in these free Domain Specific Languages!
We saw that organizing code so that it resembles the result it generates lowers the semantic distance between the code and the solution. We saw two ways to do this: we can use templates or interpolation (if our language permits interpolation) to produce data, and where templates won’t work we can structure our code to resemble its result.
Well, we needn’t stop there.
Domain-specific languages can provide this exact benefit.
The general purpose of a DSL is to write programs, or parts of programs, where the form of the code matches the mental model of domain experts or programmers. And here is one specific use for a DSL: to write code that closely matches the result it generates.
List Comprehensions model lists after mathematical notation.
list { [x, y, x * y] }.given(:x => 1..12, :y => 1..12)
directly describes a list of multiplicands and results.
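The list { ... }.given(...) form relies on a library that isn’t shown here. As a sketch using only core Ruby (Array#product and map), the same table can be described almost as directly, and compared with the loop-and-append version it replaces. The variable names are mine.

```ruby
# Every [x, y, x * y] for x and y in 1..12, in the same order a nested
# loop would produce them.
range = (1..12).to_a
table = range.product(range).map { |x, y| [x, y, x * y] }

# The loop-and-append version: the list is a side effect of running it.
table_by_loops = []
(1..12).each do |x|
  (1..12).each do |y|
    table_by_loops << [x, y, x * y]
  end
end
```

The first form says what each element looks like; the second says how to accumulate them.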
(19|20)\d\d[- /.](0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])
is a regular expression that matches dates.[3] Do you think it is hard to read? What would happen if you “compile” that expression into procedural code with side effects? Which program would be easier to understand, debug, and modify?
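One possible answer to that question, sketched in Ruby. The pattern is the one above with \A and \z anchors added (my assumption, so that it must match the whole string), and date_like_procedural? is a hand-written equivalent invented for comparison. Neither checks that the date actually exists: February 31st passes both.

```ruby
DATE_PATTERN = %r{\A(19|20)\d\d[- /.](0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])\z}

def date_like_regex?(text)
  DATE_PATTERN.match?(text)
end

SEPARATORS = ["-", " ", "/", "."].freeze

def all_digits?(text)
  text.each_char.all? { |c| c.between?("0", "9") }
end

# The same rules spelled out step by step.
def date_like_procedural?(text)
  return false unless text.length == 10

  year, sep1, month, sep2, day = text[0, 4], text[4], text[5, 2], text[7], text[8, 2]
  return false unless all_digits?(year + month + day)
  return false unless %w[19 20].include?(year[0, 2])
  return false unless SEPARATORS.include?(sep1) && SEPARATORS.include?(sep2)
  return false unless (1..12).cover?(month.to_i)

  (1..31).cover?(day.to_i)
end
```

In the procedural version the special cases are scattered across guard clauses and index arithmetic. In the pattern, each one sits at the position in the date that it constrains.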
There are many valuable uses for DSLs. One of them is to create programs that closely resemble the results they achieve, such as the HTML builder we saw earlier, list comprehensions, and regular expressions. DSLs can increase human comprehension by representing the desired result directly.
And if you call in the next fifteen minutes, you’ll get membership in the Pattern Matching family at no extra cost!
There’s another significant opportunity for writing code that increases human comprehension. We saw how to write code where the form of the code resembles its result. You can also write code where the form of the code resembles the data it consumes.
In ML, patterns allow us to make our functions resemble the different values they consume:
fun factorial 0 = 1
| factorial n = n * factorial (n - 1)
If you are not familiar with pattern matching, and especially with how languages like ML and Haskell combine patterns with their type checking system, maybe today is the day to spend a little time looking into this powerful idea for making comprehensible programs.
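For readers who would rather start from Ruby, here is a hedged sketch of the same idea using case/in pattern matching (this assumes Ruby 2.7 or later; the sum method is my own addition, not from the ML example). Ruby does not tie the patterns to a static type checker the way ML and Haskell do, so this shows only the “code shaped like its input” half of the idea.

```ruby
# The factorial from the ML example, with one clause per kind of argument.
def factorial(n)
  case n
  in 0
    1
  in Integer
    n * factorial(n - 1)
  end
end

# Each `in` clause mirrors one shape the input list can have:
# empty, or a first element followed by the rest.
def sum(list)
  case list
  in []
    0
  in [head, *tail]
    head + sum(tail)
  end
end
```

In both methods you can read off the possible inputs by scanning the patterns, without tracing any conditionals.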
[1] In my personal experience, “favouring human comprehension” does not mean favouring readability over writability: in order to write a program that solves a problem, I have to understand the solution, so comprehensibility applies to both the composing and the reading of programs by humans.
Now about machines: In this day and age, “The convenience of the machine” is often a way of saying, “the convenience of the layer of abstraction just below your program.” In a sense, every layer of abstraction above the silicon is a kind of virtual machine.
“Remember, it’s all software, it just depends on when you crystallize it.”—Alan Kay, as quoted by Andy Hertzfeld
[2] List Interpolation actually predates String Interpolation, but most people recognize String Interpolation. Lispers have a little thing called a quasiquote or backquote that builds lists or vectors in a template form.
[3] Early feedback suggested this is a poor example of a regular expression, because it looks obtuse. I could have selected something much simpler; however, I wanted something that really would be incredibly obtuse if you tried to code it procedurally (not counting using built-in library functions for parsing dates, of course).
The point is that this regex is readable, and you can pick out all of the special cases, right where they belong in the pattern. If readers can post some imperative code that does the same thing in a more readable form, that would be a very interesting lesson.