The Narcissism of Small Code Differences
def formatted_zip_code(digits)
  case digits.size
  when 4 then "0#{digits}"
  when 3 then "00#{digits}"
  else digits
  end
end
This code reads pretty much as the author would have explained how to handle the “digits” variable: If it only has four digits, prepend a zero. If it only has three digits, prepend two zeros. Simple and to the point, so move on. Save the clever code for something that matters.
The author of this code was an Agnostic. He didn’t mind using a case statement instead of if-elsif-elsif, and he paid attention to the features of his chosen programming language, but he didn’t have strong convictions about there being One True Way to express every set of choices.
His code “Just Worked.”
Donning The Hair Shirt
The Agnostic’s code carried on, quietly working, until one day another programmer chanced upon it. Our second programmer was an Ascetic. Ascetics believe in code that can do powerful things with a small set of core axioms, like recursion and first-class functions.
The Ascetic spotted a bug: What if “digits” only had two digits? The objective of the code was clearly to have at least five digits, with leading zeros. The Ascetic cracked his knuckles and went to work.
Given his favourite library function:
def y(&f)
  lambda { |x| x[x] } [
    lambda { |yf| lambda { |*args| f[yf[yf]][*args] } } ]
end
He rewrote the Agnostic’s code as:
def formatted_zip_code(digits)
  y { |rec|
    lambda { |str|
      str =~ /....(.)+/ && str || rec["0#{str}"] } }[digits]
end
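For readers who would rather not untangle the combinator, here is a rough sketch of the same recursion written as an ordinary named method. The name pad_to_five is hypothetical and appears nowhere in the original code; the regular expression is the Ascetic’s own and matches any string of five or more characters.

```ruby
# Hypothetical helper (not from the post): the Ascetic's recursion
# without the Y combinator.
def pad_to_five(str)
  # Five or more characters: return the string unchanged.
  # Otherwise prepend one zero and try again.
  str =~ /....(.)+/ ? str : pad_to_five("0#{str}")
end

pad_to_five("12")     # => "00012"
pad_to_five("90210")  # => "90210"
```

Each call adds exactly one zero, so a two-digit input takes three recursive steps to reach five characters.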
And all was well for a short time.
Toiling in the Stacks
The Ascetic checked his code in, where another programmer, a Librarian, reviewed it. She saw immediately that it could and should be re-written:
def formatted_zip_code(digits)
  digits.rjust(5, '0')
end
Librarians believe that there is a library for everything, and good languages have huge libraries providing lots of built-in functionality. Programming is the art of searching the libraries for the one magic incantation that does exactly what you want.
And for a short time, all continued to be well.
Purity of Essence
One day, management hired a senior programmer with much experience working in the bowels of BigCo. He was brought in to provide “Adult Supervision” for the programming team. He went right to work, reviewing the code base and compiling a list of sins against his religious convictions. He was an OO Purist.
I am told that a cathedral built by Purists is beautiful to behold, constructed lovingly of small bricks, no more than 50 lines per class, no more than 10 classes per package, no more than two instance variables per class, no more than one level of indentation per method, no more than one dot per line, and absolutely no else clauses.
The Purist considered ifs, cases, and even shortcut booleans to be deficiencies in code, places where the proper approach, and indeed the only approach, is to use polymorphism and dependent types.
The Purist seized upon the twice-rewritten code as an opportunity to show how such a thing ought to be written:
def formatted_zip_code(digits)
  ZipCodeDigitFactory.new(digits).to_formatted_string
end
Alas, space does not permit me to show you all of the subclasses of Digits he composed, one for each possible length of string. For example:
class FourZipCodeDigits < BaseZipCodeDigits
  def to_formatted_string
    "0#{self.to_s}"
  end
end
I have also omitted his ZipCodeDigitFactory that would take a normal string and figure out which particular subclass of BaseZipCodeDigits to construct.
Of course, the Librarian, the Ascetic, and the Purist hated each other with a passion fueled by the narcissism of small differences into a raging fire of religious cleansing. Much politicking and fighting broke out over the code.
Meanwhile, the Agnostic carried on, quietly coding while the others bickered over how to properly rewrite what he had written. One day, he noticed that some unit tests were failing and raised the issue in Campfire.
The Return of the Agnostic
“Hey,” he asked, “What happened to that piece of code? It was for zip codes, and it was only supposed to prepend one or two zeros when importing zip codes from CSV files. Anything with fewer than three digits is supposed to be an invalid code. Empty strings shouldn’t be converted to five zeros, as far as I know.”
“What’s up with that?”
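The failing unit tests themselves are not shown here. As a minimal sketch of the difference the Agnostic is describing, the snippet below runs his original method and the Librarian’s rewrite side by side. The names agnostic_zip and librarian_zip are hypothetical (both versions are called formatted_zip_code in the post), and the sample inputs are assumed to be strings.

```ruby
# Hypothetical names; the post calls both versions formatted_zip_code.
def agnostic_zip(digits)
  case digits.size
  when 4 then "0#{digits}"
  when 3 then "00#{digits}"
  else digits
  end
end

def librarian_zip(digits)
  digits.rjust(5, '0')
end

# Three- and four-digit inputs: both versions agree.
agnostic_zip("501")    # => "00501"
librarian_zip("501")   # => "00501"

# Shorter inputs: the original leaves them alone, the rewrite pads them.
agnostic_zip("12")     # => "12"
librarian_zip("12")    # => "00012"
agnostic_zip("")       # => ""
librarian_zip("")      # => "00000"
```

The original passes short inputs through untouched, so something downstream can still recognise them as invalid. The rewrite turns them into strings that look like zip codes.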
The Librarian, Ascetic, and Purist were silent for a moment, and then immediately resumed arguing, this time over whether #rjust should have another parameter, whether the regular expression in a recursive routine should be changed, or whether the simplest thing was to simply change the method for the EmptyDigits, OneDigit, and TwoDigits classes.
The Agnostic listened for a moment, tried to interject a few times, then sighed and reached for his text editor.
Update: Reviewing the comments made elsewhere, I see that this post fell into the Fizzbuzz Trap: By quoting a programming problem, no matter how banal and contrived, the article was bound to provoke a huge amount of discussion around the correct way to solve the problem, while ignoring the point of the post. This is my fault; I should have known better than to post a snippet of code. As many of you have noted, the post is not about the Agnostic or his code. It’s about the dynamic of programmers eager to rewrite code in their own image, and the hypothesis that our motivation for doing so (I am equally guilty of this behaviour) is to emphasize the small differences between ourselves and others.
For no matter how good or bad the Agnostic’s code is, why did the Librarian rewrite the Ascetic’s code? Why did the Purist rewrite it again? When they discovered that the Agnostic’s code actually met some requirement, why weren’t they talking about documentation, or the process around tests? Why were they all, including the Agnostic, eager to rewrite it one more time?
Seriously, the post is not about whether the Agnostic should have documented his code, or where the zip code validation should live. However, I should have known that the moment I included a piece of code to provide local colour, the resulting debate over its fitness was inevitable.