Golf is a good program spoiled
A reader mentioned that “It’s not about lines of code.” And he was right: Brevity alone is an unreliable way to judge a program. The problem is—quite simply—the existence of
Golf. In any Turing Equivalent language, it is possible to construct programs that are very small and very difficult to understand.
Brevity is not our goal, but it
is a side-effect of what happens when we write software well. These are my thoughts about the relationship between program size and readability. It’s my take on how you can look at one short program and dismiss it as Golf, and look at another and praise it as being succinct and expressive.
what really mattersSo what really matters? What is it about well-written software that makes it succinct and elegant? Let’s look at an absurdly trivial snippet of code and talk about five different ways we could express its intent:
a = b + c * 2
d = e + f * 2
g = h + j * 2
k = m + n * 2
p = q + r * 2
Hmm. Thirty-five symbols. What about this:
def addtimestwo(x,y)
x + y * 2
end
# ...
a = addtimestwo(b,c)
d = addtimestwo(e,f)
g = addtimestwo(e,f)
k = addtimestwo(m,n)
p = addtimestwo(q,r)
Twenty-eight symbols. Hey, we can also do this:
def addtimestwo(x,y)
x + y * 2
end
# ...
a, d, g, k, p = addtimestwo(b,c), addtimestwo(e,f), addtimestwo(e,f), addtimestwo(m,n), addtimestwo(q,r)
The same twenty-eight symbols. That leads us to:
a, d, g, k, p = *[[b,c],[e,f],[h,j],[m,n],[q,r]].map { |x,y| x + y * 2 }
Twenty-five symbols. Which reminds me:
a, d, g, k, p = *[[b,e,h,m,q],[c,f,j,n,r]].zip.map { |x,y| x + y * 2 }
Twenty-six symbols, but fewer arrays. Comparing them, is the last one less readable than the first? Less “intuitive”?
Let’s get that word out of the way immediately. “Intuitive” is not some wonderful principle that guides us to creating great software. It means no more and no less than “familiar.”
There’s value in doing the same things the same ways, but only when it’s the right thing to do in the first place. Learn your tools.
So which of these trivial examples is better? And what does that tell us about program size? Let’s look at the examples and what they tell us. The first example tells us that there is no relationship between the various expressions, none. The second, where we have ‘abstracted out’ the
addtimestwo
method, tells us that it is no coïncidence that the five expressions used the same formula: they are the same thing.
def addtimestwo(x,y)
x + y * 2
end
a = addtimestwo(b,c)
d = addtimestwo(e,f)
# ...
If that calculation was a piece of business logic, we might want to ensure that if you change one, the others all change as well. In the second example, they are on separate lines, suggesting that although the calculation is shared, the individual assignments are still decoupled from one another. It suggests to me that you might want to remove one, or move it elsewhere.
In the third example, we have moved them all onto the same line.
a, d, g, k, p = addtimestwo(b,c), addtimestwo(e,f), addtimestwo(e,f), addtimestwo(m,n), addtimestwo(q,r)
In some languages, we might have moved all of the assignments into their own block. By grouping them, we have indicated very strongly that the five assignments belong together. In an imperative language like Ruby, this
signals that there is something ‘atomic’ about this, and that you don’t want anything else to happen until all five assignments have been mode.
a, d, g, k, p = *[[b,c],[e,f],[h,j],[m,n],[q,r]].map { |x,y| x + y * 2 }
a, d, g, k, p = *[[b,e,h,m,q],[c,f,j,n,r]].zip.map { |x,y| x + y * 2 }
The fourth and fifth examples are special. The fourth example says that the inputs to these calculations is already arranged in a particular structure. The fifth example says that they are arranged in a different structure, but we will use a particular
unfold—zip—to get them into the form we want to do the calculating.
detour aheadThis is exactly where expressive languages shine, and simultaneously it is exactly where you can go badly wrong in a quest to achieve size for size’s sake. This is a trivial example, but please extrapolate the principle upwards to lambdas, methods, objects, hierarchies, modules, frameworks, and applications.
The Seasoned Schemer is the sequel to the phenomenal book The Little Schemer, a book that teaches recursion and first-class functions. The Seasoned Schemer dives into first-class functions and teaches you how to express relationships by composing and combining functions. It’s a powerful abstraction that every programmer should have in their toolbox.
The fourth and fifth examples are shorter than the first and second, but they are much worse if what they imply—a relationship between the inputs to the calculations—is contrived solely to shorten the code.
On the other hand, if that relationship actually does exist, then they are far, far superior to the other examples. If there
is a relationship there, the most important thing, the clearest thing, is to express the relationship,
to make the code we write clearly communicate the semantics of the data it manipulates.
no free lunchLet’s look at example two again:
def addtimestwo(x,y)
x + y * 2
end
a = addtimestwo(b,c)
d = addtimestwo(e,f)
# ...
Many people feel that regardless of the expressive power of their language, if it supports procedure calls, they’re good to go: they can abstract things into procedures, and bundle procedures into libraries, and with good design and decoupling and muttering yaddda yadda yadda while waving their hands just right, the result will be manageable, readable, abstracted code.
Ok, I’m pulling their legs. It isn’t wrong, it’s absolutely correct that you can get an enormous win out of abstracting procedures, just like example two. But although that is good, having other ways to organize code is
better. Example two adds some overhead: you have to read
addtimestwo
and control-click the label to jump to the implementation (alert to IDE lovers: I do this kind of thing in Ruby as well, please don’t assume that the rest of us are scrabbling with Notepad.exe).
This abstraction adds overhead. “Abstracting” the common operation has made it more difficult to read, not less difficult to read. People for who consider meta-programming some sort of Black Magic often make this exact point: The mechanism for removing duplication adds complexity itself. One view is that the overall effect is only a win if the complexity added is small compared to the duplication removed.
Another view, the one I am promoting here, is that it isn’t about removing symbols, it’s about communicating something about the underlying relationships. Even if your language is so crufty that example two is longer than example one, it’s the right thing to do if the calculations are the same for a deep reason, if you are trying to communicate that they will always be the same.
That being said, my experience is that when a relationship exists, the code that clearly expresses it usually winds up being shorter than code that does not express it.
relevance in the age of blubProgramming languages teach you not to want what they cannot provide.
I said usually. And here is the pitch for more expressive languages. Languages make expressing certain things easier than others. Each language has its strengths and weaknesses in these areas. Have you noticed how many blog posts have been written explaining
Monads?
The problem with languages is that if you are diligent, you will carefully organize your code to express all of the relationships
that your language supports. So in Pascal, you factor out all of the common functions and procedures. And when you are done, you have maximally “compressed” your code. Not in the sense of creating an unreadable mess, but you have made your code as small and as elegant as you think is possible and prudent.
But there are two problems. Pick one:
- You do know other languages, and you recognize the other relationships that Pascal does not express well. So you Greenspun OO into your code base, and you’re hacking away at first-class functions. Now you can express these other relationships, but your code base is extremely idiosyncratic, and in many cases the code is longer and more convoluted because you have to do weird backflips to make the language try to pretend to be Java or Factor.
- You don’t know any other languages, so you think that you have done your best. You have no idea why other programmers are mumbling about Objects, or First-Class Functions, these weird things merely make the code an unreadable mess. You don’t see the underlying relationships those techniques express because you are a Pascal programmer.
Either way, Pascal places limits on how well you can express the underlying relationships between things in your programs. The difference between the two situations is the difference between blissfully moving dirt because “that’s what we do, we move dirt,” and swearing under your breath all day as you fight a losing battle against dirt taking over your construction site.
An “X programmer”, for any value of X, is a weak player. You have to cross-train to be a decent athlete these days. Programmers need to be fluent in multiple languages with fundamentally different “character” before they can make truly informed design decisions.
Let’s look at an
oft-quoted example from a fictional language. We’ll call the language
Blub:
float frubbish = 0.0;
for (int i = 0; i < foo.length; ++i) {
for (int j = 0; j < bar.length; ++j) {
for (int k = 0; k < bash.length; ++k) {
frubbish += some_function(foo[i], bar[k], bash[j]); # updated
}
}
}
This is so wrong in so many ways, but let’s start with the obvious bug. If are a Junior Blub Programmer, you believe that this algorithm expresses the underlying relationship properly, so you just fix the bug. If you are a Senior Blub Programmer, you don’t just fix the bug, but you add meaningful variable names so that
wrong code looks wrong:
float frubbish = 0.0;
for (int index_foo = 0; index_foo < foo.length; ++index_foo) {
for (int index_bar = 0; index_bar < bar.length; ++index_bar) {
for (int index_bash = 0; index_bash < bash.length; ++index_bash) {
frubbish += some_function(foo[index_foo], bar[index_bar], bash[index_bash]);
}
}
}
Ah, but if you are a Blub
Architect, you realize that asking programmers to use meaningful variable names to keep out of trouble won’t work. Why, if that’s all it took, we could use dynamically typed languages like Javascript! No, we must put the compiler to work so that it is impossible to make this mistake. In fact, that’s exactly how Ada works: There is a special type for “the index of a foo,” and it is different from the type for “the index of a bar,” so the buggy code wouldn’t even have compiled. You can do this with C++ as well, and let’s pretend you can do it with Blub:
float frubbish = 0.0;
for (IndexFoo index_foo = new IndexFoo(0); index_foo < foo.length; ++index_foo) {
for (IndexBar index_bar = new IndexBar(0); index_bar < bar.length; ++index_bar) {
for (IndexBash index_bash = new IndexBash(0); index_bash < bash.length; ++index_bash) {
frubbish += some_function(foo[index_foo], bar[index_bar], bash[index_bash]);
}
}
}
The Junior, the Senior, and the Architect are all thinking within the limitations of Blub. They still want to write the same program, they don’t see that the index variable, the test for the end of the loop, the nesting of loops, and incrementing an index have nothing to do with the basic relationship you are trying to express.
Instead of asking themselves
why they have index variables, (or better still, why they have nested loops), the Blub programmers are asking themselves how to minimize the problems index variables create.
People tend to view code bases much the way construction workers view dirt: they want great big machines that can move the dirt this way and that. There’s conservation of dirt at work: you can’t compress dirt, not much, so their solution set consists of various ways of shoveling the dirt around…
Index variables are dirt, and being a better Blub programmer is an exercise in learning more sophisticated ways to handle dirt. The Java Programmer has been liberated from this kind of thinking. She writes:
float frubbish = 0.0;
for (float each_foo: foo) {
for (float each_bar: bar) {
for (float each_bash: bash) {
frubbish += some_function(each_foo, each_bar, each_bash);
}
}
}
In Beautiful Code, great programmers reveal how they found unusual, carefully designed, and beautiful solutions to high-profile projects like regular expression engines, distributed computation, software transactional memory, hygienic macro processing, and twenty-one others.
The authors of Beautiful Code share their thinking and problem solving process with us. And that’s why this book transcends so many other books about programming: thinking and problem solving approaches are universal. This isn’t a book about programs, it’s a book about programming, and with every chapter you read you will become a better programmer.
She is not concerned with ways to prevent mistakes with index variables, because Java gives her a tool to make them go away. While the Blub programmers are arguing about how to manage dirt, she is writing code that expresses the relationship she wishes to communicate.
Her code is shorter, yes. And much better. And the reason it is much better, the desirable property of her code, is that it communicates the actual thing going on and does not communicate a bunch of other things that are irrelevant.
Those extra moving parts, the
yellow code, the accidental complexity makes the code harder to read. It adds opportunities for errors. It makes it harder to maintain. And all because it contains dirt, it contains things that do not express some fundamental relationship in the data.
Shorter code is more readable when it is shorter by dint of expressing the underlying relationship, without irrelevant details.
Alas, while everyone agrees with this statement, the tyranny of programming in Blub is that the Blub programmer does not know that his code contain irrelevant details: he thinks his code does express all of the necessary ideas in his algorithm, he considers shorter code “cryptic.”
thank god, his typewriter ribbon is running dryNow we can see that although each language provides abstraction mechanisms, and lets you build new abstractions with the mechanism, new
kinds of abstractions give us new ways to express relationships. These things can be abused, of course, but nothing can save you from this: If you don’t let your Architect play with Domain-Specific Languages, what is to stop them from
configuring everything in your application with XML?
The goal is readable code that expresses the underlying relationships. When we are programming in our personal Blub, the only relationships we see are the ones our language affords. Thus, we are always tempted to think that are programs already are as readable as the underlying problem domain allows, and that efforts to make them shorter are circus tricks that make the code less readable.
And the only way to break out of Blub is to cross-train, to really
learn new languages. Otherwise, how will you know if your code could be shorter in a good way?
Shorter code
is better if it expresses underlying relationships, if it is shorter because it does not contain irrelevant cruft, if it does not contain workarounds and hacks to deal with limitations in the programming language. Of course, shorter code is not
always better. If it is “shorter by coïncidence,” by exploiting language tricks that do not reflect underlying relationships in your programs, that is bad.
Golf is a good program spoiled.