raganwald
Thursday, February 28, 2008
  Where the rubber meets the rock
I’d like to make the following observation: Chris Sharma is a talented rock climber. He climbs 5.13 (5.13 is hard!) in his bare feet, while I only climb 5.11 in technical rock shoes. Do we therefore infer that the shoes don’t matter if you’re a great climber? Well, it seems that Chris climbs 5.15 in technical rock shoes (5.15 is even harder than the climb in the video). Maybe, just maybe, shoes won’t make you a great climber but they will make you a better climber.



This weblog is founded on the belief that the single most important thing you can do to write better programs is to be a better programmer. Wasabi cannot cure rotten fish! However, it is a false dichotomy to assert that since the programmer matters, the tools do not matter.

I understand why people buy into the idea that there is a dichotomy. There is a huge industry promoting the silver bullet that there are tools out there that allow inexperienced or apathetic programmers to write good programs. The “industry” spends billions of dollars a year championing the idea that programmers don’t matter, and it is natural to push back against this and thump your fist, to climb on top of your soap box and tell the world to stop paying attention to the tool and start paying attention to the programmer.

But let’s remember the baby before we throw out the bath water. Tools do matter, and they matter in a way that is crucial to software development: better tools make you a better programmer.

The whole point of better tools is that they change the way you write programs. I am not talking about auto-completion here (not that there’s anything wrong with that, but still). Instead, I am talking about things like Parser Combinators, Continuations, Abstractions, Monads, Metaprogramming, Higher-Order Functions, and everything else that lets you build programs out of entirely new pieces, not just putting the old pieces together a little faster.

Tools matter. Not at the expense of believing that programmers matter. Tools matter because we believe programmers matter.
 

Wednesday, February 27, 2008
  Slightly off-topic: What was that link again?
If you read raganwald via RSS feed, you receive a few links every day.

Those links are all collected in this del.icio.us tag. So you can browse through them if there’s something you would like to see again.

And for the really clever sorts, there’s an RSS feed just for the links and (New!) another just for the articles. I believe you get them in real time and not as a daily summary.

ttfn unless you want to buy me a coffee :-)
 

Tuesday, February 26, 2008
  So, you think you know Regex-fu?
As I’ve mentioned:

/^1?$|^(11+?)\1+$/
is a regular expression that matches a non-prime number of ones (which means it can be used to recognize a prime number of ones, obviously). It’s obfuscated and golf-like at the same time. But here’s a challenge that takes Regular Expressions to the next level (possibly the next lower level of Hell, but these are the risks we adventurers must face):

If you can provide a regular expression such that it actually matches the string representation of only prime (or only non-prime) integers, that would be pretty sweet. A proof that such a thing could not be created would be equally impressive.
—Sam, Overthinking and Stupid Programmer Tricks

Yes, it would be sweet indeed. Anyone up for the challenge? How hot is your Regex-fu?
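
For anyone who wants to warm up first, here is a small sketch of how the unary regex quoted above can be exercised. The constant and method names are made up for illustration; only the regular expression itself comes from the post. A number n counts as prime when a string of n ones does not match.

```ruby
# The regex from the post: matches a NON-prime number of ones.
NON_PRIME_ONES = /^1?$|^(11+?)\1+$/

# Hypothetical helper: n is prime when "1" * n fails to match.
def prime_by_regex?(n)
  ('1' * n) !~ NON_PRIME_ONES
end

primes = (0..30).select { |n| prime_by_regex?(n) }
puts primes.inspect
```

The first alternative rejects zero and one; the second tries to split the string into two or more equal groups of at least two ones, which only succeeds for composite lengths.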
 

  Let's Make This Personal, Please
The whole Open Classes thing seems to be turning into another debate along the lines of static vs. dynamic typing or perhaps GOTO vs. Structured Programming. I’m all for a lusty debate about programming languages, it’s a definite interest of mine. But I’m going to put out an appeal right now: this time around can we try something different?

In the past, we (as in “we the blog echo chamber” or perhaps “we the programming.reddit.com flame wardens” or even “we the news.ycombinator.com mutual admiration society”) have fallen into the nasty habit of “negotiating for someone else.” Do you know what that is? It’s when you say, “Well, that’s fine with me, I can program Emacs to follow the Python indentation rules but Ralph over there hates being told how to format his source code.” Or perhaps you say, “Test-driven development is fantastic, but most programmers are too busy to write tests.” Not to mention when you say, “Parser combinators are fabulous, you can write a really good parser and eliminate all the accidental complexity, but when we outsource maintenance to India they won’t be able to understand my code.”

Those are all valid concerns for somebody, but how about we let those people speak up for themselves, hunh? And it swings the other way, we can argue the Optimistic View about other people. But whether we’re optimists or pessimists, instead of debating things based on our perspective of their experience, let’s debate things based on our own experiences and our own honest likes and dislikes.

As in—and this is my actual personal experience—“I can appreciate how dynamic typing—as exemplified by modifying open classes at run time—makes a program confusing. I feel the same way when I look at a Java program using extensive Dependency Injection driven by external XML configuration files, I have to pick my way through everything to figure out what is really going on at run time.” Note well the use of the word “I” and tying my observation to my actual experience.

That is a very, very different thing than arguing about what might happen to someone else under different circumstances than the ones we live within. Not that there aren’t important things happening elsewhere: You and I are just two people on a planet of billions. But quite honestly, we can go crazy hypothesizing what other people will think and do. I have my hands full just figuring out what’s going on with my code on my projects.

So let’s keep this personal, please. Let’s talk about our actual experience, please.

Thank you.
 

Monday, February 25, 2008
  (1..100).inject(&:+)
I want to show you my favourite line of Ruby code [1][2]:

(1..100).inject(&:+)
This code works out of the box in Ruby 1.9. This also works in Ruby 1.8 with Rails, thanks to this chunk of code called “Symbol#to_proc”:

class Symbol
  # Turns the symbol into a simple proc, which is especially useful for enumerations.
  def to_proc
    Proc.new { |*args| args.shift.__send__(self, *args) }
  end
end
Now, I should explain what I like about it. No, we aren’t going to debate whether { |acc, n| acc + n } contains accidental complexity. We are not going to talk about why this is better than whatever you would do in PHP or Visual Basic to sum the integers. Let go of any idea that this article is about golf. I like this line of code because it says a lot about how the Ruby language works. I do not want to talk about those other things, they are not interesting to either of us.

The two things I do want to talk about are that: 1. Somebody made Symbol#to_proc in Ruby 1.8, and that 2. It became part of Ruby 1.9. Let’s start with how it works.

Symbol#to_proc works because Symbol is an Open Class in Ruby. And Ruby is Dynamically Typed, meaning you can modify classes and objects at run time, thus Symbol gets the to_proc method added when you run the above code. (You sometimes see the expression “Dynamic Typing” used to describe a system where you don’t need to write the type of an object before using it. I call that “Latent Typing” and use the word “Dynamic” to describe a system where types can change thanks to adding or removing methods. In this case, we are adding a method to every Symbol in a program, which is pretty dynamic.)
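
As a minimal sketch of what "open" means here, the snippet below reopens Symbol at run time and adds a method to it. The method name call_on is hypothetical and exists only for this illustration; it is not part of Ruby or Rails.

```ruby
sym = :upcase
before = sym.respond_to?(:call_on)   # the method does not exist yet

# Reopen the core class and add a (hypothetical) method to every Symbol.
class Symbol
  def call_on(receiver, *args)
    receiver.__send__(self, *args)
  end
end

after = sym.respond_to?(:call_on)    # the same object now responds to it
sum = :+.call_on(40, 2)
shout = sym.call_on('ruby')
puts [before, after, sum, shout].inspect
```

Note that sym was created before the class was reopened, and it still picks up the new method. That is the sense of "dynamic" used above.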

The Acid Test

One of the things that really intrigues me about this little snippet of code, (1..100).inject(&:+), is that you can tell an awful lot about someone’s attitude towards programming styles, languages, and cultural bias by asking them what they think of it. It touches on functional programming, on brevity, and in this case, on open classes.

Take a moment to really think this through. What do you think of the fact that someone sat down and modified one of Ruby’s most core classes, Symbol?

This is a decidedly non-trivial thing to do. Changing Symbol in any way opens up a Pandora’s box of risk. You can break a lot of code. You can—I beg your pardon, you will—confuse people who have never heard of Symbol#to_proc. Just so you can express “Fold the numbers from one to one hundred using addition” in a very direct manner eliminating the boilerplate of { |a, n| a + n }.

If Symbol wasn’t an open class, how would inject(&:+) have been added to the language? Open classes and other “turtles all the way down” features are dangerous. But dangerous features help a language evolve. Lisp’s macros are extremely dangerous. And they are also what made it possible to discover/invent CLOS. Had there been no macros, the only way to experiment with new ideas in OO would be to create a whole new language, an exercise in Accidental Complexity.




The Ruby Way is the perfect second Ruby book for serious programmers. The Ruby Way contains more than four hundred examples explaining how to do everything from distribute Ruby with Rinda to functional programming techniques.

When you look at a language like Java, you see this ponderous, bureaucratic process for language change. Java does not include dangerous features like macros or open classes, and as a result change can only come from the language’s implementation, not from out on the fringe where people are using it and creating their own features to address their own needs.

(As someone pointed out, not all languages with centralized control evolve as slowly as Java. But they still are centralized, they still impose a vision from the top. The point is about central planning vs. the free market if you want to toss another metaphor on the pile.)

Lispers talk about Bottom-Up Programming. Well, dangerous features enable bottom-up language evolution. We discovered we like Symbol#to_proc because it bubbled up from the bottom. Someone invents something. If other people like it, they use it. The word gets around. People improve on it. Eventually it gains acceptance and becomes the de facto way to write code.

This is true in all languages, but languages—like Ruby—that include dangerous features give the fringe a broader latitude to invent new things. Of course, they also break things and they invent stupid things and they get excited and write entire applications by patching core classes instead of writing new classes and commit all sorts of sin.

The number of readers who seem unable to get past the literal syntax-as-solution of (1..100).inject(&:+) and grasp that the interesting bit isn’t the literal utility of that line of code, or that there are other ways to implement the same functionality, or that pet language X has an easier/smaller/cleaner mechanism for summing integers, or that there are “better” lines of “cooler” code, is really quite depressing to me.

Is it really so difficult to work out that the interesting bits are in the mere existence—or not—of that code at all, per the developer’s whim?
—Chris Cummer

But they will also invent Symbol#to_proc. The “marketplace” has voted very strongly in favour of Symbol#to_proc, so much so that it is now in the next revision of the language. Maybe you love Symbol#to_proc, maybe you hate it. But somebody loves it, enough to have created it, and enough other people liked it that you can find implementations everywhere. So it clearly has some appeal. But that isn’t the point. Even if it is a terrible feature, it is a feature invented by the people and for the people, not a feature pushed down from above.

Open classes make invention possible in real time. No committee. No waiting for the next language revision. If you think of it, you can try it. Does Symbol#to_proc make it worthwhile? How about andand? Why don’t you tell me? I was serious when I asked you what you think of it.

Are you young at heart?

For some, languages are best when they are mature, when there is nothing left to discover, when all of the “best practices” have been laid down in stone. There’s a comforting stability, a knowledge that some hot shot they hire tomorrow will not arrive with a bunch of new techniques you have never heard of. In mature, stable environments the people around the longest have the most authority.

When anybody can invent a new thing, it’s the youngsters who hold the power. If you don’t like the way Ruby handles nil, you can write your own andand. Surprise surprise, so did half a dozen other people, so for the moment you’re going to find that there is no clear standard, no “One obvious way to do it” in Ruby. That kind of thing is painful for a certain type of personality: they want to know “Which way is the best.” Ideally, they want Matz (or as one person commented, Guido) to anoint one of these the “winner” by building it into the core. That way the oldsters get the power back, they have the experience so they have memorized more of the core API.
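
To make "write your own" concrete, here is a rough sketch of a home-grown nil guard built with open classes. This is not the andand gem's implementation or API: the names guarded and NilSink are invented for this example, and the behaviour is deliberately simplistic.

```ruby
# Hypothetical helper object: swallows any message and answers nil.
class NilSink < BasicObject
  def method_missing(*_args)
    nil
  end
end

# Reopen Object: by default, guarding a value just returns the value.
class Object
  def guarded
    self
  end
end

# Reopen NilClass: guarding nil returns the sink instead of raising later.
class NilClass
  def guarded
    ::NilSink.new
  end
end

name = 'raganwald'
missing = nil
present_length = name.guarded.length     # normal method call
absent_length = missing.guarded.length   # nil instead of NoMethodError
puts [present_length, absent_length].inspect
```

Half a dozen people would each write this a little differently, which is exactly the "no clear standard" situation described above.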

Invention isn’t its own reward. Creating new stuff for the sake of novelty wears thin pretty quickly. But when you are making up the rules, you get to stack them in your favour. If you create it, it works for your needs. Naturally, there is an argument that the mature, tested heavyweight whatsis is better than your half-baked roll-your-own whatsis. Or at least, there is an argument if we are talking about an Enterprise Transactional Data Processing Platform.

But look at Rails. It is much smaller than something like Spring. So it was much better for 37 Signals than Spring, even though they rolled their own. Rolling your own is a lot harder in a language—like Java—that doesn’t let you have the dangerous features. The trade-offs between using the old and inventing the new are different in a language like Ruby. You still want to use what already works most of the time. But there’re a lot more places where inventing something new makes sense.

Ignorance is Strength

Naturally, a lot of people are going to get the trade-off wrong. They’ll reinvent something that already exists. Or they’ll gratuitously use a dangerous feature when a perfectly serviceable safe feature is easier and faster to use. That’s the consequence of giving people a choice, sometimes they choose wrong. Especially if they are young, especially if they don’t know any better.

But guess what? Not knowing any better is what made David create Rails. He didn’t know Ruby doesn’t work for web applications. He didn’t know it’s wrong to create a web framework when there are so many sitting on the shelf waiting for you to use them right out of the box.

When I look at some code like (1..100).inject(&:+), I see the thing in Ruby that made Rails possible. I see all of the good things—and all of the bad things too. In that way, it’s totemic. And in truth, I think you can really decide for yourself what you think of Ruby just from looking at that one snippet of code.

If it scares the bejeezus out of you, if you start thinking of rules to impose on your team (we will use no more than three of the following seven gems on any one project. You will have an Architect approve any modification to one of the following 132 classes…), then Ruby is not for you.

But if you love (1..100).inject(&:+), if it fires up your imagination, if you think that you might write your own gem one day and see if the community likes your idea for improving the language, then I think you will be able to handle the inevitable snafus with a pragmatic shrug of the shoulders (Maybe I shouldn’t have extensively patched the Array class just so that I could steal the { x => y } notation for my pattern-matching DSL). You will evolve your judgement so that you know when to pull out the dangerous features and when to use restraint.

And whether you love open classes or hate them, whether you can think of a way to use them or whether you think there is always a better way, the indisputable truth about them is that Rubyists are using them to evolve the language from the bottom up, to find new ways to do things. Good things, bad things, beautiful things, ugly things… they are all New Things.

And ultimately, that is what this line of code says to me about Ruby. It says that this is a language where the fringe is inventing new things. And to embrace Ruby is to embrace the idea of a language being propelled by its user base.

I use the language, so I obviously have some degree of comfort with the idea of a language evolving from the bottom up. But what do you think? It’s your opinion that really matters, not mine.


  1. Explanation: (1..100) creates a Range. For our purposes, this is equivalent to a collection of the whole numbers from one to one hundred, inclusive (The major way in which it differs from an array for us is that it doesn’t actually require one hundred spots in memory. Ranges are also useful for matching.) The #inject method is called fold or reduce in other languages. Expressed like this, it applies the “+” function to the elements of our range, giving us the sum. Primary school students could tell you an easier way to obtain the sum, of course, but we will ignore that for now.

    Ruby doesn’t actually have a function called +. What actually happens is that #inject wants a block. The & operator converts Proc objects into blocks and blocks into Proc objects. In this case, it tries to convert the symbol :+ into a block. The conversion uses Ruby’s built-in coercion mechanism. That mechanism checks to see whether we have a Proc object. If not, it sends the #to_proc method to the argument to make a Proc. If the Symbol :+ has a #to_proc method, it will be called. In Ruby 1.9 it has a #to_proc method. That method returns a Proc that takes its first argument and sends the + method to it along with any other arguments that may be present.

    So, &:+ really means { |x, y| x + y }. And the whole thing gives us a simple sum, just as (1..100).inject(&:*) would give us the product. Now that you know that, what does (1..100).map(&:to_s).map(&:size) do?

  2. My favourite? Really?

    Actually, yes. Not the most powerful. Not the most compelling. Not even the most concise, and I don’t defend it as being superior to (1..100).inject { |acc, n| acc + n } in any way. And not my favourite line of code from any language: Ruby isn’t my favourite programming language, so it would be surprising if this were my all-time favourite line of code.

    But it is, to me, a totemic line of Ruby code. A line that demonstrates what a lot of Ruby is all about as a language. Its good and its bad. People have written to say that you could use sum or whatsis or jeebus to do the same thing in Ruby or in other languages, so this is a bad example. No it’s not! Trying to write an unassailable line of code that everyone will worship for its perfection is a fool’s errand. The point is to demonstrate something of Ruby’s character, especially in its evolution.

    As someone wisely pointed out, the equivalent code in other languages needn’t sum the numbers from one to one hundred, it needs to demonstrate something about that language’s culture and evolution. Java might show something with generics and type erasure. C# might do something with LINQ.

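
    A small sketch of the coercion described in note 1: the & operator will accept any object that responds to #to_proc, not only a Symbol. The Sender class below is a hypothetical stand-in written for this example. Its #to_proc body mirrors the Symbol#to_proc code shown in the article, so the three sums should agree.

```ruby
# Hypothetical stand-in for a Symbol: remembers a method name and
# builds the same kind of Proc that Symbol#to_proc returns.
class Sender
  def initialize(name)
    @name = name
  end

  def to_proc
    name = @name
    # Take the first argument and send it the method, passing the rest along.
    Proc.new { |*args| args.shift.__send__(name, *args) }
  end
end

sum_with_symbol = (1..100).inject(&:+)
sum_with_block  = (1..100).inject { |x, y| x + y }
sum_with_object = (1..100).inject(&Sender.new(:+))  # & calls Sender#to_proc
product         = (1..5).inject(&Sender.new(:*))
schoolroom_sum  = 100 * 101 / 2                     # the primary school shortcut

puts [sum_with_symbol, sum_with_block, sum_with_object, product].inspect
```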
 

Friday, February 22, 2008
  The recursive implementation of /bin/true
Here is the source code to /bin/true in Solaris (via The Daily WTF):

#!/usr/bin/sh
# Copyright (c) 1984, 1986, 1987, 1988, 1989 AT&T
# All Rights Reserved

# THIS IS UNPUBLISHED PROPRIETARY SOURCE CODE OF AT&T
# The copyright notice above does not evidence any
# actual or intended publication of such source code.

#ident "@(#)true.sh 1.6 93/01/11 SMI" /* SVr4.0 1.4 */

/bin/true simply exits successfully, and Solaris implements this as a shell script. There’s a lot to laugh about on a Friday afternoon, but a thought struck me: What exactly is the copyright notice copyrighting? Not any implementation code: This “implementation” of /bin/true doesn’t do anything, so it has no executable code. Not the idea of implementing /bin/true as a shell script without implementation code: you’d need a patent to protect that idea.
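
If you want to check the "no executable code" claim for yourself, here is a small sketch in Ruby. It assumes a Unix-like system with sh on the PATH, and the temporary file is a stand-in for the real script: it writes a file containing nothing but a comment, runs it, and looks at the exit status.

```ruby
require 'tempfile'

# Stand-in for /bin/true: a script with comments and no commands.
script = Tempfile.new(['true', '.sh'])
script.write("# nothing here but a comment\n")
script.close

ok = system('sh', script.path)   # true when the script exits with status 0
status = $?.exitstatus
script.unlink

puts "exit status: #{status}"
```

A script that runs no commands finishes with a zero exit status, which is all that /bin/true has to do.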

I conclude that the copyright notice is protecting… The copyright notice! There is a delicious self-reference there in the style of Gödel, Escher, Bach.
 

  The house that Mike built
I’ll tell you, as a contractor, I’m sick of fixing crap. There’s nothing better than building it right the first time. Repairing bad work is full of compromises—you’re dealing with existing structure, with existing mistakes, with other peoples’ choices of inferior materials and bad designs. It’s always more work, and it costs more, to go back and fix what was done wrong in the first place.
—Mike Holmes, The house that Mike built
 

Thursday, February 21, 2008
  The Mouse Trap

In the board game Mouse Trap, players build an elaborate Rube Goldberg machine. Wikipedia explains: The player turns the crank (A) which rotates the gears (B) causing the lever (C) to move and push the stop sign against the shoe (D), which tips the bucket holding the metal ball (E) which rolls down the stairs (F) and into the pipe (G) which leads it to hit the rod held by the hands (H), causing the bowling ball (I) to fall from the top of the rod, roll down the groove (J), fall into and then out of the bottom of the bathtub (K), landing on the diving board (L). The weight of the bowling ball catapults the diver (M) through the air and right into the bucket (N), causing the cage (O) to fall from the top of the post (P) and trap the unsuspecting mouse (i.e. the player who occupies the spot on the board at that time).

Software sometimes suffers from a Mouse Trap Architecture: it becomes a chain of fundamentally incompatible components used for purposes far removed from their core competencies, incomprehensibly connected to each other with brittle technologies. And here is the tale of how one such system came about.

The project was originally designed by a Business Analyst who had been a DBA in her younger days. One of the key requirements of the system was that it be completely flexible: it was designed to accommodate almost any change in business requirements without reprogramming. Her approach to designing a program that would be very nearly Turing-complete was to make a data-driven design: nearly none of the business logic was to be written in code, it was to exist in various tables in the database. Changes to business logic would be accomplished by updating the database.

Of course, the application would just be an empty shell, so the actual business analysis would consist of gathering requirements and generating data for the tables. The program was obviously trivial and could be generated in parallel with the business analysis, delivering a huge time saving over her company’s traditional waterfall model where analysis was completed in its entirety before the first line of code was written.

Delighted with this breakthrough methodology, all parties signed off on a seven figure contract and she started by building a large Excel workbook with the business rules, one sheet per table. The plan was to simply export her workbook to CSVs and import them into the database when they were ready to deploy the finished application. And in the mean time, the customer could sign off on the business rules by inspecting the Excel workbook.

Meanwhile, the trivial job of designing a web application for two hundred internal users plus a substantial public site to handle millions of users with unpredictable peak loads was handed off to the Architect. While her Golden Hammer was the database, his was XML and Java. His first order of business was to whistle up a Visual Basic for Applications script that would export the workbook to XML. From there, he wrote another script that would generate the appropriate configuration files for Struts in Java. His work done, he moved along to another project leaving some impressive presentations that delighted the customer immensely.

Implementation being an obvious slam dunk, the company put a few people on the project more-or-less part time while they completed the final easy stages of other delivery projects. Thanks to the company’s signature up-front analysis and rigid waterfall model, they were confident that customer UAT and delivery into production on other projects would not generate any meaningful bugs or requirements changes, so the resources would have plenty of time for this new project.

The New Guy

But just to be sure, they hired the New Guy (not to be confused with the New Girl). The New Guy had a lot of New Ideas. Generally, his ideas fit into one of two categories: Some were sound but unworkable in the company’s environment, and the others were unsound and still unworkable in the company’s environment. His early days were marked by attempts to hook up his own wifi so he could surf on his shiny new Tablet PC during meetings, attempts to disconnect the loud pager that would interrupt all programming in the cubicle farm whenever the receptionist was paging a salesperson, and attempts to get the project to fix all bugs on completed features before moving on to write new features.

When he saw the design of the system, he immediately grasped its deepest flaw: Changes to business requirements in the Excel workbook could cause problems at run time. For example, what if some business logic in Java was written for a Struts action that vaporized when a business rule was rewritten?

Today, we can sympathize with his obsession. He was deeply discouraged by the company’s insistence that development run at full speed developing new features with the actual business of making the features work deferred to UAT at the end of the project. One developer claimed that she had a working dynamic web page switching back and forth between English and Inuit, but the English version was generated by a JSP backed by a stub class and the Inuit version was actually a static HTML page. Management gave this a green light to be considered “feature complete” after the customer failed to ask any pertinent questions during the Friday-afternoon-after-a-heavy-steak-lunch-and-before-a-long-weekend demonstration.

Depressed and quite pessimistic about the team’s ability to orchestrate Java development in parallel with the rapid changes to the workbook, he came up with the solution: a series of XSLT files that would automatically build Java classes to handle the Struts actions defined by the XML that was built by Visual Basic from the workbook that was written in Excel. Then, any changes that were not properly synchronized with the Java code would cause a compiler error, and the team would be forced to update the Java code immediately instead of hand waving things.

Excel >> VBA >> XML >> XSLT >> Java!

The New Guy ripped his phone out of its socket, ignored all emails, and worked around the clock. When management came looking for him, he hid in vacant conference rooms, feverishly tapping away on his tablet. A few days later, he emerged triumphantly with his working build system. Saving a change to the Excel workbook automatically generated the XML, which in turn automatically generated the Java classes and rebuilt the entire application, along with regenerating the database tables in a style that would presage Rails migrations.

He presented this nightmare of dependencies and fragile scripts to management and waited for the response. They had shot down every single one of his ideas without exception, and now he was promoting a daring scheme that owed its existence to the proposition that their management style was craptacular. But he was a man of principle, and was committed to do the right thing in the face of any opposition.

Management wasted no time in responding. Brilliant! He was obviously far too valuable a resource to waste time on implementation. He was promoted to a junior Architect role where he could deliver demonstrations to clients. His very first job? To write a white paper explaining the new technology.

Expecting to get the axe, he was shocked by their warm reception. He had failed to realize that management was indifferent to the idea’s actual development value, but had a keen sense of what played well with clients in Enterprise environments. These companies lived and breathed integration between wildly disparate technologies, many of which didn’t work, had never worked, and never would work.

I suspect this is typical of Mouse Trap architectures everywhere. Built with the best of intentions, they survive for reasons their creators could never have anticipated.

Epilogue

The company adopted the new build scripts immediately and assigned a junior programmer who was working on several other projects to maintain them. Within months they had been dismantled, in no little part because the team hated the idea that every time the business analyst changed the business rules, their Java code that was carefully constructed for a previous version would stop compiling.

The New Guy lasted a few months longer before realizing that his sudden accord with management was illusory, and that nothing really had changed. He has now forsworn his love for static typing and is now wandering a production Ruby code base muttering about software development the way the demented wander the streets muttering about government conspiracies.

All that remains of his work are a few XSL files somewhere, like old game pieces that are rolling around the bottom of a drawer at the cottage hoping that someone will open a bottle of wine and call for a game of Mouse Trap to pass the time.

The story depicted here is 100% true. Any similarities in tone and style to The Daily WTF are intentional.

Update: Never assume that you are alone in your madness.
 

Wednesday, February 20, 2008
  Exception Handling in Software Development
Exceptions to the process should be reserved for exceptional circumstances.

This basic principle came up during a discussion of Crunch Mode. “Crunch Mode” is a hip way to say Death March, only you make it sound like it’s not so bad because it only happens for a short time at the end of a project, and it’s a temporary phenomenon just to tide us over until things return to normal. In other words, Crunch Mode is an exception to the way things are normally done in response to exceptionally tight deadlines.

Or is it?

Exception handling

I have encountered developers who say “I know this isn’t the right way to do things, but this is an exception” to explain why they are not writing unit tests, refactoring problem code, or any of a number of other practices that they assure me are the right way to write software. Yes, they know the overall value of certain practices. And they understand that the rot they are introducing to the code—like technical debt—is dangerous and expensive to carry. Not just for “future generations of maintenance programmers,” but for the current team on the current milestone. But this is an exception, damn it!

My first question is always how often this has happened in the past. Were they doing this last week? Last milestone? Last project? If the answer is yes (and sadly for me, it has been yes far more often than no), then my next job is to stop them from using this excuse. If doing things “The wrong way” is common practice on your team, it isn’t exceptional: It’s business as usual. And that means you have to change or ‘fess up, jut your chin out, and boldly proclaim that you don’t need no stinkin’ unit tests.

Exceptional behaviour is certainly possible and might be the right thing to do under exceptional circumstances. But there are two tests for exceptional circumstances you can apply: First, if they happen frequently, they aren’t exceptions: The way you handle them is your business as usual.

Second, if they can be predicted in advance, they aren’t exceptions either.

Managing exceptions with your crystal ball

I know for a fact that stake holders want things that aren’t in the telephone book of requirements. I know for a fact that no matter how many rspecs and unit tests I write, handing my software to someone else will uncover bugs. H*ck, I might discover bugs just deploying my software on another system, and I think it’s an immutable law of software that the more important the person viewing a demo, the more embarrassing the bugs will be.




Critical Chain is an amazing book. The narrative form—a novella detailing a technical project team and their search for a way to manage an uncertain process—highlights the ways that Critical Chain Project Management handles risks and uncertainty and makes it visible where you can manage it in advance.

The section on estimating tasks alone is priceless, and I can tell you that the book gave me a vocabulary and a process for effectively negotiating dates and deliverables with my managers.

I cannot tell you with a straight face that any of these “unplanned things” require exceptional behaviour on my part. I may not know exactly which bug will arise, but I know statistically that bugs happen, that requirements come into focus, and that estimates are just estimates. I do not say to my boss, “I had no idea they wanted a view for this behaviour, I had no idea this other thing wouldn’t work as expected, therefore we need to finish the work without testing.”

I can—and do—expect these things to happen, and therefore I can—and must—plan for them.

Of course, if you fail to manage appropriately, or fail to convince your team that you ought to manage appropriately, then you will find yourself doing things “The wrong way.” But you cannot blame the circumstances, you must blame the choices your team made (or neglected to make) when you first planned your development. You must say, “We chose this course of action with our eyes wide open back on such-and-such a date.”

I’ve been talking about coding here, but I usually think about team management as the goats. When people are burning out in a Death March at the end of the project, am I to believe that the manager “had no choice, the client was unreasonable?” Let me see. How do all of the projects you manage go? Are they all like this? Must I look at the second test, whether you could have predicted that this client would behave like every other client?

Managing projects means either endorsing and advocating things like Death Marches or it means predicting the circumstances that lead to them and making decisions in advance that avoid them.

Captain Crunch

Crunch Time can be a conscious decision. But please, show me where you wrote “overtime” in the plan at the start of the project. Show me how you planned to complete the high value, must do well features first, leaving the ones that could be incomplete and buggy for the end when people would be working without sleep.

If you fail to plan for Crunch Time, you are not managing your project properly. For example, the traditional model of project management had the software become “feature complete” first, and then Crunch Time would inevitably occur during a bug fixing phase. The problem with this is that the team was now painted into a corner where you could not manage the process. Sure, you could decide which bugs to fix, and therefore you had some ability to decide which features were more important than others, but you could only manage the buggy features.

If something was of lower value but was working properly, you could not go back and do a buggy job of it in half the time so that the developer would be freed up to fix bugs on something important. Managers knew this, of course, which is why they often whipped developers to complete their work in haste early in the project. Paradoxically, this made it easier to manage priorities in a traditional model.

The agile model addresses the issue of managing Crunch Time by doing high value activities first, so that when Crunch Time comes around, you only have low value items. There is no need to whine that you have no choice because a critical feature needs to be ready on Monday. And if you discover something entirely new and unforeseen to do for Monday, you know that it can leapfrog the other items in the backlog because they aren’t that important.

Does agile always eliminate Crunch Time? No, nothing involving humans and free choice always does what you expect. But it is an example of a model where you predict the “exceptional circumstances” and prepare for them. There are other ways to predict exceptional circumstances and prepare for them, of course. The point is that the ends of milestones and projects are known to be difficult and challenging and you need to have a plan from day zero.

And that is the point of managing software development, whether you are managing your behaviour as a developer, managing a team, managing stake holder expectations, or managing a product: it’s your job to expect and plan for these things so that the only time you are “making exceptions” is when the circumstances are truly exceptional, when rare and unforeseeable opportunities arise.


This essay was provoked (in a good way!) by James Golick’s Crunch Mode Paradox: Turning Superstars Average.
 

Saturday, February 16, 2008
  The Stupidity of Crowds, Part I: The Wisdom of Crowds
In The Wisdom of Crowds, James Surowiecki describes how a large crowd of people can be smarter than any one of the crowd’s members, including its experts.

The dynamics of crowd expertise are vital to the collaborative mechanisms that drive social networking, search engine rankings, and everybody’s favourite whipping boy, social news sites.

If you aren’t familiar with the notion, there are a couple of ideas that you may find counter-intuitive. For one thing, we are conditioned to believe that experts are smarter than the hoi-polloi. And if we want even more cowbell^H^H^H^H^H^H^H expertise, we impanel a committee of experts to pool their expertise.

In Part II, we will look at how crowds go wrong, including why committees are stupider than both their individual experts and the crowd at large. We’ll find out why you should never Ask Reddit for feedback about your software or business idea, and for kicks we’ll revisit the Game of Technology Survivor.

The Wisdom of Crowds

If you want to extract the actual wisdom of a crowd, you need to aggregate each member’s independent sincere opinion.

First, you need each member’s opinion: if your mechanism for sampling the crowd’s wisdom is biased, you will not get a good result. You may think you are getting a representative sample, or that you are only getting the smartest people in the crowd, but you will actually just get a biased sample that is wrong. The “expert’s committee” is flawed because it’s a committee, but also because it is made up of experts.

Take project management, for example. It’s pretty easy to identify experts. So-and-so writes a blog about shipping software. So-and-so has actually shipped a lot of software. So-and-so has a piece of paper certifying expertise in project management. So-and-so is actually managing a project of interest.

And if we are looking at a particular project, if we have experts running the project, all they do all day long is think about what will be ready by when. So you would think that we could get the most accurate information about what will be ready by when if we ask the experts. And that’s how we constantly get software development projects wrong.

An expert is better than the average member of the crowd. Perhaps even better than any other member of the crowd. But that is not the same thing as saying that the expert is better than the entire crowd. There are a number of reasons for this. One of the most important reasons the crowd can outperform its most expert member is that information is not perfectly disseminated. If the expert knew everything, they might be better than the crowd. But they don’t know everything, and neither does any other member of the crowd. You only get the complete picture when you find a way to poll the entire crowd, not just the ones you think are the experts.

This is absolutely the case with project management: that’s why most of our processes for project management revolve around communication, getting individuals to communicate what they know about the project to the experts managing the project. The first key to obtaining the wisdom of the crowd is to harness the wisdom of the entire crowd.
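To make the aggregation idea concrete, here is a small Python sketch. It simulates a crowd whose members are individually noisy but whose errors are independent and unbiased, then compares the error of the averaged guess with the average error of the members. Every number in it is invented, and the function name `crowd_vs_members` is just a label for this illustration.

```python
import random
import statistics

def crowd_vs_members(truth, guesses):
    """Compare the error of the averaged guess with the average
    error of the individual guesses."""
    crowd_error = abs(statistics.mean(guesses) - truth)
    member_error = statistics.mean([abs(g - truth) for g in guesses])
    return crowd_error, member_error

# Simulated crowd: each member is noisy, but the errors are
# independent of one another and not biased in either direction.
# The truth and the spread are made-up values for the example.
rng = random.Random(42)
truth = 100.0   # say, the real number of days a project will take
guesses = [rng.gauss(truth, 20.0) for _ in range(500)]
crowd_error, member_error = crowd_vs_members(truth, guesses)
```

The averaged guess can never be further from the truth than the average member is, and with independent errors it is usually far closer. The sketch says nothing about a crowd whose members share a bias: if everyone guesses low for the same reason, averaging preserves that error rather than cancelling it.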

You also need each member’s independent opinion, as mentioned above. If you gather everyone in a room and ask them to vote, you get different results than if you ask them privately. You get especially different results if each person jealously hides what they think from the others: you get better results when each team member has no idea what the other members think. This sounds antithetical to team spirit and effectiveness: the whole point of communication is for team members to tell each other what they know. But if team members are communicating in ‘real time with high bandwidth,’ you aren’t going to get independent opinions, even if you poll them privately.

For that reason, you often have to go beyond a team if you want to know how the team is going. You can’t get independent opinions by asking the team itself.

Finally, you need sincere opinions. This is almost impossible to achieve by asking people. As Greg House says, “Everybody lies.” My experience with software development is that people lie and they don’t even know they are lying! People say what they think the boss wants to hear, or what will get the meeting finished so they can get back to playing SimCity. They especially will say whatever they think is socially acceptable to say, even if they are filling out an anonymous poll.

Before continuing, please give me your feedback:

[Poll: Long essays...]

Do you see how the results are hidden from you before you are asked to answer? That is Statistics 101: If you tell people how other people have voted, you influence the results. This effect is so important, it is often the only thing being debated in a democratic election: which person is “most likely to win.” That’s because the crowd tends to vote for whoever they perceive is going to win anyway, so each candidate struggles mightily to persuade everyone that everyone else is going to vote for them.

As I write this, Hillary Rodham Clinton is claiming that although Barack Obama polls well, he will lose the presidential election if nominated by the Democrats. Her argument is that although Americans want to be seen as voting for the best person, in actuality they won’t vote for an African-American. Whether true or not, this argument illustrates that people will say what they think they ought to say, even in anonymous privacy.

Extracting the wisdom

Right now, all of the challenge of extracting the wisdom of a crowd goes into finding a way to get the entire crowd to give independent and sincere opinions. And all of the stupidity of crowds arises from mechanisms that bungle one or more of these three points.

Please return for Part II, where we will look at various mechanisms for extracting the expertise of a crowd and see why biased samples, lack of independence, or incentives to lie all create a crowd that is stupider than its members, not smarter.
 

Friday, February 15, 2008
  Raganwald at RubyFringe
I’m really excited about this: RubyFringe is “A single-track indie conference with no paid technical sponsors.” Some of the most interesting people I know are participating, people like Leila Boujnane, Evan Phoenix, Damien Katz, and Obie Fernandez.

RubyFringe is an avant-garde conference for developers that are excited about emerging technologies outside of the Ruby on Rails monoculture.

RubyFringe breaks new ground. It captures the spirit of living on the bleeding edge, the culture of people who are interested in creating the future, not just going along with whatever happens to transpire.

And I will be there. I’ll have more to say about my participation in the next few weeks, but I will commit to this: I plan to launch something wild at RubyFringe.

Registration opens Monday, February 18th. Please do not miss out on this!

RubyFringe
 

  Raganwald at SciBarCamp
I will be attending SciBarCamp at Hart House in Toronto March 15 and 16, 2008. I’m looking forward to sharing ideas with people like Michael Nielsen.

SciBarCamp is a gathering of scientists, artists, and technologists for a weekend of talks and discussions. It will take place at Hart House at the University of Toronto on the weekend of March 15-16, with an opening reception on the evening of March 14. The goal is to create connections between science, entrepreneurs and local businesses, and arts and culture.

Did you know that the co-author of “one of the most highly cited physics book of the last 25 years, and one of the ten most highly cited of all time” is a Rubyist? Neither did I, and now you know why I’m excited to mingle with such a diverse group of interesting and intelligent people.

There are still a few places available. If you will be in Toronto that weekend (or are looking for an excuse to visit), please join us.

See you there!
 

Tuesday, February 12, 2008
  The value of work
Anansi decided he wanted to have some fish. He went to Babatunde the Fisherman and offered to help Babatunde fish that day. Babatunde agreed, and they went to the river.

Babatunde said, “Let us spread the nets together.” Anansi demurred. “That is too inefficient: specialization is the key to efficiency. All week you do the work and you get tired. This is not fair that you have to work hard and your reward is exhaustion. So, you will spread the nets and I will get tired.”

And so Babatunde spread the nets while Anansi lay in the shade of a tree on the river bank, loudly proclaiming how heavy the nets were, how difficult it was to wade through the river current to set them, and how tired and sore he was.




Anansi Goes Fishing tells this timeless tale of West Africa with a completely different twist. It’s a wonderful book to share with children from three to ninety three.
Later, it was time to harvest the catch. Babatunde felt bad for Anansi who was still complaining about being tired, so he suggested that Anansi harvest the fish and he, Babatunde, would suffer the burden of being tired. But no, Anansi would hear nothing of it, so Babatunde harvested the fish and Anansi again was tired. And so it was again that Babatunde gathered the nets while Anansi was tired, and Babatunde carried the fish back to the village while Anansi walked beside him, groaning aloud with the effort.

They reached Babatunde’s house and Babatunde put down the heavy load. Anansi offered to divide the catch, but now Babatunde demurred. “Brother Anansi,” he said, “There is no need for you to work, you have been tired all day. I will make us a splendid dinner and we can discuss the portioning over a delicious meal.”

And Babatunde made a delicious fish stew, with peppers and spices. He put out two bowls, one for Anansi and one for himself. He ladled the hearty meal into his own bowl and began to eat. Anansi asked him, “Brother Babatunde, where is my share?”

Babatunde looked at him in surprise: “I thought that since you were tired all day, you deserved a reward: therefore, I am going to eat the fish and you will feel full.”



Many times you will hear people arguing whether the value is in an idea, or the implementation. Whether it is in the development, or the promotion. Whether it is in the muscle, or the management. Whether it is in the execution, or the imagination.

My suggestion is that if you are the one doing the work, arrange your affairs such that you—and not your partner—decide how much goes into each bowl when the stew is ready.
 

Monday, February 11, 2008
  Blodgett's Law
A development process that involves any amount of tedium will eventually be done poorly or not at all.
Blodgett’s First Law of Software Development

See also: verbose programming languages, boilerplate code, commenting code, and so forth…
 

  The Naive Approach to Hiring People
Every once in a while I read (or write!) something about hiring programmers. What to look for in a résumé. What to put in your résumé. Why _____ is my favourite interview question. Why _____ sucks as an interview question. Whether we need to filter the absolute dreck out.

I’ve even written one of those _____ sucks posts on my blog, and I’m here to tell you, I was wrong. And I’m going to tell you why I was wrong. But first, here is an interesting programming problem-style interview question. I’m not suggesting it is good or bad, for reasons that will become obvious.

An interesting interview question

You have a large collection of documents, each of which accurately describes a single person’s properties. One document, one person. To keep this light, perhaps you are looking for a compatible bridge partner. The documents are online player profiles, and you are interested in finding a suitable partner. The properties for people are multi-valued, there is a large set of properties, and for each property in each document, there is either no value or a selection from a set of values. One value might be number of years of experience, another might be whether they overcall in third position with a weak hand (where “no value” means the other person did not answer that question and “no” means they do not overcall).

This is an iterative problem, you have to perform the separation on a regular basis, perhaps once each month. And each month, there is a new set of documents and persons to classify. Having performed the classification, you can check the game results at your local bridge club and see how everybody did, both the people you selected as potential partners and the people you rejected.

Describe a strategy for picking the best partners based on their profiles.

Before we discuss whether it is a useful problem, let me tell you who I’m interviewing with this hypothetical question: Technical Hiring Managers, specifically people who are technical themselves and are also responsible for hiring other technical people. Part of their job is looking through piles of résumés, picking out the good ones to phone screen.

What I’m looking for in a “correct” answer is a basic understanding of Document Classification. Given that we are talking about programming and programmers, a really good answer will discuss things like Naïve Bayes Classifiers. Like programs that can distinguish Ham from Spam.1

The point is that someone with at least a basic understanding of document classification knows how to apply what we know about document classification to the problem of selecting candidates to phone screen based on their résumés.

Someone with an understanding of document classification knows how to apply what we know about document classification to the problem of selecting questions to ask in phone screens and in face-to-face interviews, not to mention what to do with the answers. (And the emotionally nice thing about this is that it’s an interview question for interviewers to solve.2)

If Statements vs. Classification

A very senior Microsoft developer who moved to Google told me that Google works and thinks at a higher level of abstraction than Microsoft. “Google uses Bayesian filtering the way Microsoft uses the if statement,” he said.
—Joel Spolsky, Microsoft Jet

I could make a reasonable argument that someone who doesn’t think of selecting candidates as a classification problem might miss the fact that the things to look for (years of experience with a specific technology, length of time in the most recent position) are merely document features with probabilities attached to them. I could make the argument that they are “thinking in if statements” about hiring programmers.

I could go on about how saying a particular job requires “Five years of JEE” is an if statement, and one that is far from universal. So someone who thinks like that is not a good interviewer, that they really ought to be thinking in terms of the probability that someone with five years of JEE will be Ham and not Spam.

Oh, the irony. I would be arguing that the interesting question is useful because it identifies people who pose questions like that as being bad interviewers!

There are really two approaches to take in selecting candidates. The first is the approach of the if statement: You form a model of what the candidate ought to do, work out what they ought to know in order to do that, and then you work out the questions to ask (or the features to look for) that demonstrate the candidate knows those things. If they know this and this and this and if they don’t have this bad thing or that bad thing, call them in for an interview (or, if you are interviewing them and they have demonstrated their strength, hire).

The second approach is the classifier approach. Each feature you look for, each question you ask, is associated with a probability. You put them all together and you classify them as interview/no interview or hire/no hire with a certain degree of confidence.
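As a minimal sketch of what “putting them all together” could look like, here is the naïve Bayes combination rule in Python. The feature names and every probability below are made up for illustration; real values would have to come from training on actual outcomes, and a production classifier would also weigh the absence of features, which this sketch ignores.

```python
import math

def classify(features, likelihoods, prior_ham=0.5):
    """Combine per-feature evidence into P(Ham | features), naively
    assuming the features are independent of one another.

    likelihoods maps feature -> (P(feature | Ham), P(feature | Spam)).
    """
    # Work in log-odds so many small probabilities do not underflow.
    log_odds = math.log(prior_ham / (1.0 - prior_ham))
    for feature in features:
        if feature in likelihoods:
            p_ham, p_spam = likelihoods[feature]
            log_odds += math.log(p_ham / p_spam)
    # Convert the accumulated log-odds back into a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical numbers, invented for this example only.
LIKELIHOODS = {
    "five_years_jee": (0.30, 0.25),   # barely informative
    "uses_junit": (0.60, 0.20),       # three times likelier among Hams
    "fp_experience": (0.40, 0.10),    # four times likelier among Hams
}

confidence = classify({"uses_junit", "fp_experience"}, LIKELIHOODS)
decision = "interview" if confidence >= 0.5 else "no interview"
```

With these invented numbers the two features multiply the odds by three and by four, so the confidence comes out at 12/13, a little over 92%. The output is a degree of belief rather than a yes or a no, and where to draw the interview line is a separate decision.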

So is the classifier the same thing as the if statements, only with percentages instead of boolean logic? Perhaps we could simply make up a score card (10 points for each year of JEE, 15 points if they use JUnit, &c.)? No.

The most important thing about most classifiers is that they can be remarkably naïve and still work. In fact, they often work better when they are naïve. Specifically, they do not attempt to draw a logical connection between the features that best classify candidates and the actual job requirements. Classifiers work by training themselves to recognize the differences that have the greatest statistical relevance to the correct classification.

That’s the naïveté at work: they have no idea that experience in functional programming is irrelevant to a job writing JavaScript: they just notice that the people with FP experience tend to do well in JavaScript jobs, so they start considering it relevant.

Training day

Document classification systems are trained, typically using supervised learning: “These are the résumés of the good people. These are the résumés of the ones we had to fire.”

Here’s a thought experiment: Pretend you are trying to write a mechanical document classifier. Let’s see if designing a machine to perform the process can identify some opportunities to improve the way humans perform the process. (As a bonus, we might actually identify ways machines could augment the process, but that is not our objective.)

If you were writing a document classifier for résumés, the first thing you would probably write would be a feature that updated the training corpus whenever a programmer completed their initial probation: If their first formal review was positive, their résumé would be added to the “Interview” bin. Otherwise, it would be added to the “No Interview” bin.
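A toy version of that training feature might look like the following Python sketch. The class name, the feature strings, and the outcomes are all hypothetical; the point is only that the classifier counts which features show up in which bin and never asks why.

```python
from collections import Counter

class ResumeClassifier:
    """Toy supervised learner: counts how often each feature appears
    in each bin. It knows nothing about what the features mean."""

    def __init__(self):
        self.docs = Counter()   # bin name -> number of résumés filed
        self.counts = {"interview": Counter(), "no interview": Counter()}

    def train(self, features, bin_name):
        """Called once a probation review is in: file the résumé."""
        self.docs[bin_name] += 1
        self.counts[bin_name].update(set(features))

    def likelihood(self, feature, bin_name):
        """P(feature | bin), with add-one smoothing so that an unseen
        feature never yields a probability of exactly zero."""
        return (self.counts[bin_name][feature] + 1) / (self.docs[bin_name] + 2)

    def relevance(self, feature):
        """A ratio above 1 means the feature points toward 'interview'."""
        return (self.likelihood(feature, "interview")
                / self.likelihood(feature, "no interview"))

clf = ResumeClassifier()
# Hypothetical probation outcomes, invented for this example.
clf.train({"javascript", "fp_experience"}, "interview")
clf.train({"javascript", "fp_experience"}, "interview")
clf.train({"javascript"}, "no interview")
clf.train({"javascript", "five_years_jee"}, "no interview")
```

After these four made-up reviews, `clf.relevance("fp_experience")` is 3.0, `clf.relevance("javascript")` is 1.0, and `clf.relevance("five_years_jee")` is 0.5. Nobody told the classifier that functional programming matters for a JavaScript job; it simply noticed where the feature turned up.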




Programming Collective Intelligence breaks out of “thinking in if statements” and provides practical examples for building systems that reason based on learning from data and behaviour, such as the Naïve Bayesian Filters discussed in this essay and collaborative filters such as recommendation engines.

This is a big, big difference between approaching hiring people as an exercise in if statements and as an exercise in classification. If you are working with if statements, you only change the if statements when something radical changes in the job or in the pool of people applying for the job.

But if you are approaching hiring people as an exercise in classification, you are constantly training your classifier. In fact, the quality of your results is driven by your process for training, for continuous improvement. It’s a process problem: how do we do a good job of training our classifier and keeping it trained?

Consider the training process I mentioned above: you build a document classifier, and you feed it the résumés of people you hire after they complete probation. If they quit or are fired, they are marked as “No Interview.” If they get a lukewarm review, we mark them as “No Interview.” But if they get a good review, they are marked “Interview.” What do you think?

Ok, thanks for using the comment link to tell me what you think. Here’s what I think: this is dangerously incomplete. Pretend we’re sorting emails into Ham and Spam. Training our résumé classifier based on who we thought was originally worth an interview is like training our email classifier based on which emails ended up in our in box. It totally ignores the good emails that were classified as junk. To classify emails properly, you have to go into your junk mail folder every once in a while and find the one or two good emails that were misclassified as junk, then mark them “not junk.”

Our thought experiment has identified a critical component of classification systems: to train such a system, you have to identify your false negatives, just as junk mail filters let you sort through your junk mail and mark some items not junk.

Where hiring people is concerned, what is the process for checking our junk mail filter? How do we find out whether any of the résumés we passed over belonged to people worth hiring? I don’t have an answer to this question, but thinking of résumé selection as an exercise in document classification identifies it as an obvious weakness in the way most companies handle interviewing: as an industry, we don’t do much to train our selection process.
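Purely to show the mechanics of “checking the junk folder,” here is a Python sketch that draws a random sample of rejected résumés for a second look and estimates a false negative rate from whatever that audit finds. The function names, the résumé ids, and the outcomes are all invented, and whether giving rejected candidates a second look is practical for a real company is exactly the open question.

```python
import random

def pick_audit_sample(rejected_ids, sample_size, seed=None):
    """Choose some rejected résumés to review anyway, so the
    classifier has a chance to learn about its false negatives."""
    rng = random.Random(seed)
    pool = sorted(rejected_ids)   # fixed order for reproducibility
    return rng.sample(pool, min(sample_size, len(pool)))

def false_negative_rate(audit_outcomes):
    """Share of audited rejections that turned out to be worth hiring.

    audit_outcomes maps résumé id -> True if the person proved to be Ham.
    """
    if not audit_outcomes:
        return None               # no audit, no estimate
    return sum(audit_outcomes.values()) / len(audit_outcomes)

# Hypothetical ids and outcomes, invented for this example.
rejected = {"r01", "r02", "r03", "r04", "r05", "r06", "r07", "r08"}
audit = pick_audit_sample(rejected, sample_size=4, seed=2008)
estimate = false_negative_rate({"r02": False, "r05": True,
                                "r07": False, "r08": False})
```

In this made-up audit one of four rejected résumés belonged to someone worth hiring, so the estimate is 0.25. Sampling at random matters: if you only re-examine the rejections that already look promising, you are running the same filter twice.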

A metric fuckload of process

A company really obsessed with hiring well would keep statistics. I know, I can feel your discomfort. More paperwork, more process, more forms to fill out. But honestly, every process is improved when you start to measure it. Maybe we measure too many things, or the wrong things. My ex-colleague Peter Holden is a terrific operational manager. His metric for metrics is to ask whether a particular measurement is a management report, meaning—in his operations lingo—is that piece of data used to make an active decision in the business?

For example, if we actually store résumés and also the outcomes—whether we hired them, how they did—and then use that data to constantly improve how we select résumés, then that is a management report and that is data worth collecting.

Likewise, we could ask questions in interviews and actually track who answered correctly and who answered incorrectly and whether the answer had any correlation with a candidate’s eventual job performance. Does that sound like too much work? Seriously? Are you drinking the same kool-aid I’m drinking about the importance of hiring good people and the critical need to avoid bad hires?

The bottom line in my interviewing technique is that smart people can generally tell if they’re talking to other smart people by having a conversation with them on a difficult or highly technical subject, and the interview question is really just a pretext to have a conversation on a difficult subject so that the interviewer’s judgment can form an opinion on whether this is a smart person or not.
—Joel Spolsky, The Phone Screen

Or let’s move up a level. Many people like the touchy-feely voodoo approach to interviewing. Joel Spolsky calls certain questions “a pretext to have a conversation on a difficult subject so that the interviewer’s judgment can form an opinion on whether this is a smart person or not.” So maybe the answer to the question can’t be tracked in a neat yes/no, right/wrong way.

But you know what you can track? How about tracking whether each interviewer is a reliable filter? Do you keep statistics for which interviewers let too much Spam through, for which interviewers are so conservative that they statistically must be turning good people (Hams) away?
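Such statistics need not be elaborate. The Python sketch below assumes a simple log of (interviewer, verdict, outcome) records; the record layout, the interviewer names, and the data are hypothetical.

```python
from collections import defaultdict

def interviewer_stats(records):
    """records: iterable of (interviewer, said_hire, worked_out), where
    worked_out is True/False for people who were hired and later
    reviewed, or None when the outcome is unknown."""
    tally = defaultdict(lambda: {"seen": 0, "passed": 0,
                                 "known": 0, "bad": 0})
    for interviewer, said_hire, worked_out in records:
        t = tally[interviewer]
        t["seen"] += 1
        if said_hire:
            t["passed"] += 1
            if worked_out is not None:
                t["known"] += 1
                t["bad"] += (not worked_out)
    stats = {}
    for interviewer, t in tally.items():
        stats[interviewer] = {
            # A very low pass rate may mean Hams are being turned away.
            "pass_rate": t["passed"] / t["seen"],
            # A high share of bad hires means Spam is getting through.
            "spam_through": t["bad"] / t["known"] if t["known"] else None,
        }
    return stats

# Hypothetical interview log, invented for this example.
LOG = [
    ("alice", True, True), ("alice", True, False),
    ("alice", True, False), ("alice", False, None),
    ("bob", False, None), ("bob", False, None),
    ("bob", False, None), ("bob", True, True),
]
stats = interviewer_stats(LOG)
```

In this invented log, “alice” passes three of four candidates and two of her three hires did not work out, while “bob” passes one in four and his one hire did well. Neither number proves anything on its own. A low pass rate only hints that good people are being turned away, because the outcome for a rejected candidate is never observed.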

No? I must be honest with you. Until now, neither did I. Although I do not speak for Mobile Commons, I’ll bet we will be discussing it soon. We’re serious about growing, we’re serious about hiring really good people, and we don’t want to put on the blinders and demand “Five years of JEE.” Which means we want to talk to a lot of people who are “Smart and Get Things Done.” And which also means we need to get really, really good at bringing good people on board.

Which means we want to ask the questions that actually help us distinguish the best from the not-so-best. Which brings me back to my interesting question above, and why I won’t say whether it’s good or bad. Because I haven’t trained my filter by asking it of a representative sample and then determining the correlation between a supposedly correct answer and actual fitness for the job.

And the only way to know if it is useful is to incorporate it into a classifier and see if it collects a high conditional probability.
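For a single question, the conditional probabilities can be estimated directly from tracked outcomes. This Python sketch assumes one record per candidate who was asked the question and whose later job performance is known; the data is invented.

```python
def question_signal(records):
    """records: list of (answered_correctly, did_well_on_the_job).
    Returns P(did well | correct) and P(did well | incorrect)."""
    def p_good(group):
        return sum(group) / len(group) if group else None
    correct = [good for answered, good in records if answered]
    incorrect = [good for answered, good in records if not answered]
    return p_good(correct), p_good(incorrect)

# Hypothetical tracking data, invented for this example.
RECORDS = [
    (True, True), (True, True), (True, True), (True, False),
    (False, True), (False, False), (False, False), (False, False),
]
p_if_correct, p_if_incorrect = question_signal(RECORDS)
```

Here the two estimates are 0.75 and 0.25, so in this made-up sample the question carries real signal. If the two numbers were close, the question would tell you little, however clever it feels to ask. The estimate is only as good as the sample: if everyone who answered incorrectly was rejected, there is nothing to put in the second group.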

Summary

I am not suggesting that naïve Bayesian filters can outperform human interviewers, or that asking fuzzy questions like “How would you design a Monopoly game” has no place in hiring, or that an experienced programmer cannot tell if another person is an experienced programmer by talking to them.

I am especially not suggesting that people do not make false statements: many of the people I have interviewed in my career really believed that working on one Java application for two years made them experienced programmers with strong OO architecture skills.

But as stated clearly above, I am claiming that someone with at least a basic understanding of document classification knows how to apply what we know about document classification to the problem of selecting candidates to phone screen based on their résumés. I am claiming that what we know about training classification systems can be applied to improving the hiring process.

And mostly I am claiming that when we take a single question or feature, like "Years of experience," or perhaps, "Ability to write Fizzbuzz in an interview," the correct way to reason about its applicability to the hiring process is to think of its statistical correlation to our objective, not to try to construct a chain of if statements.

If you find this interesting, Games People Play discusses what to do about the fact that candidates will say or do anything to get a job, including lie about their experience.



An Apology

Remember I told you I was wrong about thinking something sucked?

Did you ever take that test yourself? Deckard?
—Rachel, Blade Runner

Once upon a time, I was asked an interview question, and I gave a very thorough answer, including all of the usual correct answers plus an unusual nuänce, a corner case that most people probably would have skipped. It cost me the job: as it turned out, the interviewer told me I was mistaken. I carried that on my back for years, even though the job probably wasn’t all that great a fit.

But now, I realize that worrying about answering the question correctly is thinking in if statements. If I get it correct, then I must be fit for the job. Not true at all. There could be a classifier question where there is a strong reverse correlation between getting the question correct and confidence in classifying you as “Ham.”3

The only thing that matters about that interviewer is whether, on the whole, he does a good job of separating Ham from Spam. Perhaps he does, in which case I was simply one of those statistical necessities, a false negative. Or equally valid, the question itself may have been highly valid, as was his interpretation of the answer: the only thing that matters might have been that answering in the manner the interviewer expected was highly correlated with job success, and that answering in the manner I did was negatively correlated with job success.

Naïve classifiers are brutal in that way. They don’t work the way you expect them to work. Spam filters give relevance to all sorts of words you wouldn’t expect. Or to phrases you don’t expect (thanks to interesting work with Markov Models). It’s a precise, bloodless process.

It isn’t personal. And for that reason, we really ought to back away from thinking about hiring in if statements. It’s a path that leads right towards taking it personally. As interviewees, we take questions or puzzles that we find difficult very personally. We get angry if we are asked things we consider irrelevant to the job. Secretly, we want interviewers to validate our worth, not just by saying “Hire,” but by valuing the things we value about ourselves, which means we look for interviewers to have if statements that align with our notions of competence.

And as interviewers, it is difficult to take ourselves out of the equation. If we only hire people just like us, we have no opportunity to learn and improve on our hiring practices. Hiring people unlike ourselves is hard if we hire with if statements. It requires valuing our incompetence instead of our competence.

Approaching the problem as a problem in classification is our road out of that emotional swamp. It’s a process we can explain and understand without being personal, without judging ourselves as people or our candidates.

With this new understanding, I apologize to that interviewer for my criticism of the interview process. I will try to improve my approach to discussing interviewing and interview questions in the future.



  1. There are a lot of classification algorithms, and this essay is not a claim that Naïve Bayes is ideal for any or all hiring purposes. But I use it as an example because most people understand spam filters and roughly how they work. [back]

  2. Although this isn’t the subject of the essay, please feel free to use this question in the following manner: If you find yourself in an interview where the interviewer bombards you with puzzle after puzzle in an effort to impress you with how smart he is, when he folds his arms and asks you if you have any questions for him, pull this one out. Let me know how it goes :-) [back]

  3. Reverse-engineering classifiers can be futile, but one can imagine a question that reveals the person answering it is highly overqualified for a basic clerking job. Or something. [back]
 

Friday, February 08, 2008
  If you want to pass, cheat. If you want to learn, research
On Monday of this week, I began an experiment. I stopped looking up Ruby and Rails information in my PDF of the second edition of Programming Ruby and in the excellent on-line documentation for Rails. Every time my fingers twitched to do a search, I reached for an actual dead-tree book instead. Not one of those quick references designed for looking things up, but a book where the author actually teaches the material.

I have been looking things up using the stone-age technology of a good index. And then I’ve been reading actual explanations, not just copying code snippets off the ‘net and pasting them into my programs.

It’s Friday afternoon and neither you nor I want to slog through a heavy essay, so let’s just flip right through the pages of metaphors and anecdotal examples from my own experience. Imagine for a moment that there were a number of really insightful remarks, a few good quotes from other bloggers or authors, and a couple of great code samples. Since we’re in a hurry, let’s jump right to the end:

So in conclusion, if you just want to get something done quickly without gaining a fuller, deeper understanding, by all means use a cheat sheet. But if you want to actually learn more about what you’re using, the next time you need to look something up, don’t go to the cheat sheet. Don’t Google it. Look it up in an actual book like The Ruby Way or its equally informative companion The Rails Way. It may take a moment longer, and you can’t copy and paste. But a good book contains far more than an explanation of what to do. It shows how it works, when to do that, and why it works that way.

A good book is going to answer a question you didn’t know needed to be asked. It’s tough for a busy programmer to find time to read books from cover to cover and learn the material. But every time you need to look something up, you have a golden opportunity to research the material instead of cheating the answer. Use those opportunities to learn: That knowledge is going to make you a better programmer.
 

Wednesday, February 06, 2008
  Turtles all the way down, please
My other favourite interview question

Someone was once phone-screening me for a job in a start up, and they asked me to name my two strongest languages. At the time, the answer was Java and C++, partly because I had just finished writing Java development tools in C++, and partly because while I loved Scheme and Smalltalk, I knew that anyone using them in day to day production would quickly expose me as a dilettante if I were to answer with my heart instead of my head.

After I said “Java and C++,” the follow-up was interesting. The interviewer asked me: If you could change both languages, name one feature from Java that you would add to C++, and why. And name one feature from C++ you would add to Java, and why.1

Today I would answer that question differently, of course, but I bring it up because there is a feature of Common Lisp, Arc, and C++ that I miss whenever I use a conventional object-oriented language like Smalltalk or Ruby: lvalues and references. In short, what I miss is the ability to redefine the assignment operator.

Or do I? Let’s have a look at what that means and why I miss it.

Defining a = b

In Ruby, you have a limited ability to define the assignment operator. Specifically, you can define methods that look like fields or instance variables:

class Foo
  def something=(value)
    # do something
  end

  def []=(index, value)
    # do something else
  end
end

f = Foo.new
f.something = :else
f[:where] = :what

Of course, that means something entirely different than writing:

something_else = :fizzbuzz

Which binds :fizzbuzz to the local variable something_else. But that’s really it. All uses of “=” in Ruby are one of these cases: binding a variable, calling “[]=,” or calling a named method on an object and passing one parameter.

So?

Well, there’s something I would like to do in Ruby that simply cannot be done. If I have an array I make from a range, say (1..100).to_a, how do I double the even numbers in that array? Obviously, I can write things like

a = (1..100).to_a
...
a.each_with_index do |n, i|
  a[i] *= 2 if n % 2 == 0
end

But imperative loops are yucky. I could write it a little functionally:

a = a.map { |n| n % 2 == 0 ? n * 2 : n }

But that creates a new array and overwrites my variable. Besides the fact that it seems wasteful, this is absolutely the wrong solution if the original array is shared elsewhere in my code: I am not changing the original array, I am merely binding a local variable to a different array. Try writing a function called “double_the_even_values!(arr)” (that doubles the even values of arr in place), and you’ll see what I mean.
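A minimal sketch of that exercise shows the difference between rebinding and mutating. The name double_the_even_values_by_rebinding is hypothetical, chosen only to contrast with the bang version.

```ruby
# Rebinds only the local parameter: the caller's array is left untouched.
def double_the_even_values_by_rebinding(arr)
  arr = arr.map { |n| n % 2 == 0 ? n * 2 : n }
  arr
end

# Mutates the array in place, so everyone holding a reference sees the change.
def double_the_even_values!(arr)
  arr.map! { |n| n % 2 == 0 ? n * 2 : n }
end

shared = [1, 2, 3, 4]

copy = double_the_even_values_by_rebinding(shared)
untouched = shared.dup          # shared is still [1, 2, 3, 4] at this point

double_the_even_values!(shared) # shared is now [1, 4, 3, 8]
```

The in-place version works, but only because Array happens to provide map!. It still is not the assignment syntax the rest of this essay is wishing for.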

Really, I want to write:

a.select { |n| n % 2 == 0 } *= 2 # or perhaps:
a.select { |n| n % 2 == 0 }.each { |n| n *= 2 }

But I can’t. If a.something gives me a value and a.something = else sets a value, and if a[where] gives me another value and a[where] = what sets another value, then:

When a.select { … } gives me some values, why doesn’t a.select { … } = stuff set some values?

Why not

There’s a reason it doesn’t work that way. Never mind whether Matz was copying some of Smalltalk’s assignment semantics when he designed this piece of Ruby, let’s talk about if statements and cases. Once upon a time, people wrote programs with lots of explicit branches in their code. If this is the case, do this. If that is the case, do that.


One of the big ideas in OOP was polymorphism. When you program with interfaces, a lot of the logic of what to do for each case goes into your class hierarchy: that’s what method dispatching is all about. This is great, because when you want to do something new, the whole idea is that you create new classes with interfaces just like your existing classes and they just work.






Okay, we can stop smoking the good stuff, nothing just works and OOP is not a silver bullet. But the key idea is that by moving some of our logic into our objects, we can write more generalized code elsewhere, code that does not need to know what an object is doing inside itself.
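As a small illustration of moving logic into objects, here is a sketch in which the generalized code never asks what kind of thing it holds. The classes Square and Rectangle and the function total_area are hypothetical names invented for this example.

```ruby
class Square
  def initialize(side)
    @side = side
  end

  def area
    @side * @side
  end
end

# Added later: total_area below does not change to accommodate it.
class Rectangle
  def initialize(width, height)
    @width = width
    @height = height
  end

  def area
    @width * @height
  end
end

# Generalized code: no case statement on the class of each shape,
# just a message send that each object answers in its own way.
def total_area(shapes)
  shapes.inject(0) { |sum, shape| sum + shape.area }
end

total = total_area([Square.new(2), Rectangle.new(2, 3)])
```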

Let’s look at a different paradigm: Higher-order functional programming. In this paradigm, functions are first-class values: you can write functions that take functions as arguments and/or return functions. Again, there is a de-coupling: if you write a compose function that creates a new function out of two or more other functions, it does not need to know what those other functions do, just how to wire them together. Monads are based on this premise as well, on the premise of writing generalized functions that do not need to know the intimate details of the things they consume and produce.
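Here is a minimal sketch of such a compose function, written with plain lambdas. The argument order (rightmost function runs first) is an assumption made for this example, as are the names double and increment.

```ruby
# compose(f, g) returns a new function that computes f(g(x)).
# It only wires the functions together: it never looks inside them.
def compose(*functions)
  lambda do |value|
    functions.reverse.inject(value) { |result, function| function.call(result) }
  end
end

double    = lambda { |n| n * 2 }
increment = lambda { |n| n + 1 }

double_then_increment = compose(increment, double)
increment_then_double = compose(double, increment)

double_then_increment.call(5) # returns 11
increment_then_double.call(5) # returns 12
```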

The problem with languages like Ruby and Java is that everything is an object and every action is a message, except when it isn’t. The things that are objects and messages are remarkably flexible and work well. The things that aren’t objects and messages are like case statements in older programs: places where you cannot extend the language without dealing with thousands of these case statements that are all coupled to each other in unexpected ways.

In Ruby’s case, the rules for what it means when you write expression1 = expression2 are hard-coded into the parser like a case statement. You cannot create a select= method that modifies its receiver. (Actually, you can define the method. Good luck calling it without resorting to send.)
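To see what that parenthetical means in practice, here is a sketch using a hypothetical wrapper class, Numbers, whose select= replaces every element the block picks out. The class and its semantics are assumptions for illustration, not anything Ruby ships with.

```ruby
class Numbers
  attr_reader :items

  def initialize(items)
    @items = items
  end

  # Hypothetical setter: overwrite every element for which the block is true.
  def select=(replacement)
    @items.map! { |n| yield(n) ? replacement : n }
  end
end

numbers = Numbers.new((1..6).to_a)

# The syntax we would like is rejected by the parser:
#   numbers.select { |n| n % 2 == 0 } = 0
# so the only way to hand the method both a block and a value is send.
numbers.send(:select=, 0) { |n| n % 2 == 0 }

numbers.items # returns [1, 0, 3, 0, 5, 0]
```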

Of course, you can write a method to do what we want:

a.update(
  :where => lambda { |n| n % 2 == 0 },
  :with => lambda { |n| n * 2 }
)

But now we have a serious inelegance: why do three kinds of getting and setting values use one syntax, but the fourth uses a different syntax?
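For reference, the update call shown above could be implemented along these lines. This is only a sketch: the module name Updatable is hypothetical, and mixing it into a single array with extend is one choice among several (reopening Array would work too).

```ruby
module Updatable
  # Replace, in place, every element accepted by options[:where]
  # with the result of passing it to options[:with].
  def update(options)
    where = options.fetch(:where)
    with  = options.fetch(:with)
    map! { |n| where.call(n) ? with.call(n) : n }
  end
end

a = (1..6).to_a.extend(Updatable)
alias_of_a = a # a second reference to the same array

a.update(
  :where => lambda { |n| n % 2 == 0 },
  :with  => lambda { |n| n * 2 }
)

alias_of_a # returns [1, 4, 3, 8, 5, 12]
```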

How Common Lisp handles this

Common Lisp has something called a Generalized Variable (also called a generalized reference).2 In OO terms, it’s an object that has a setter and a getter method, although you can call them the access and update operations if you prefer.

Common Lisp has a macro, setf, that performs an update. In short, if you call:

(setf expr1 expr2)

Common Lisp does something analogous to:

expr1.update( expr2.respond_to?(:access) ? expr2.access() : expr2 )

Let’s pretend that Ruby did this. This would be very interesting! For starters, instead of writing an accessor this way:

class Bar
  def something
    @something
  end

  def something=(value)
    @something = value
  end
end

You would define a single method that returns a generalized variable:

class Bar
  def something
    GeneralizedVariable.new.me do |gv|
      def gv.access
        @something
      end
      def gv.update(value)
        @something = value
      end
    end
  end
end

Now whenever you write some_bar.something you would get a generalized variable and Ruby would figure out when you need its value and when you need to assign something to it. Of course, that looks like a lot of boilerplate, but you could rewrite the existing attribute class methods to do this for you. (Actually, this exact code wouldn’t even come close to working because @something would belong to the new object, but squint at it sideways and pretend we can do that.)
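One way to stop squinting is to build the generalized variable out of closures, which capture the Bar instance and therefore see its @something. The sketch below assumes a hypothetical GeneralizedVariable class taking a getter and a setter, plus a hypothetical setf function standing in for what the language itself would do on assignment. Ruby provides neither.

```ruby
class GeneralizedVariable
  def initialize(getter, setter)
    @getter = getter
    @setter = setter
  end

  def access
    @getter.call
  end

  def update(value)
    @setter.call(value)
  end
end

# Stand-in for "expr1 = expr2" under the pretend semantics.
def setf(place, value)
  place.update(value.respond_to?(:access) ? value.access : value)
end

class Bar
  def something
    # The lambdas close over self, so @something is Bar's own variable.
    GeneralizedVariable.new(
      lambda { @something },
      lambda { |value| @something = value }
    )
  end
end

bar = Bar.new
setf(bar.something, :else)

other = Bar.new
setf(other.something, bar.something) # copies the value, not the variable
```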

You could then write your select method to return either a single generalized variable or a collection of them, depending on how you want your semantics to read. For example, you could choose which of the following ought to work:

a.select { ... } = expr
a.select { ... }.each { |x| x = expr }
a.each { |x| x = expr if ... }

The reason why this is interesting is that the original language designer does not need to worry about whether people want to write select methods that can update their host collections. The language designer merely needs to write things in a sufficiently general way, and people will work out the implications for themselves.
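As a sketch of the second reading, here is a select-like function that returns a collection of generalized variables, each one standing for a slot in the original array. Place and select_places are hypothetical names, and the explicit access and update calls stand in for what assignment would do if the language supported this.

```ruby
# A generalized variable for one slot of an array.
Place = Struct.new(:array, :index) do
  def access
    array[index]
  end

  def update(value)
    array[index] = value
  end
end

# Like select, but returns places in the host array instead of copied values.
def select_places(array)
  array.each_index
       .select { |i| yield array[i] }
       .map { |i| Place.new(array, i) }
end

a = (1..6).to_a

select_places(a) { |n| n % 2 == 0 }.each do |place|
  place.update(place.access * 2)
end

a # returns [1, 4, 3, 8, 5, 12]
```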

Returning to Ruby (and a lot of other languages), now we see why everything being an object and everything that happens being a method invocation “except when they aren’t” is a problem: Assignment is one of those things that isn’t, and because of that, there are places where our programs aren’t going to be consistent with the language’s philosophy. If assignment was always a method invocation, I suggest that there would be a way to write a select that modifies its receiver.

Of course, there would also be things like this:

2 = 3

Which would now pass blithely through the parser only to raise a NoMethodError at run time because numbers do not implement an update method (At least, not in Ruby. Legend has it that this was legal in early versions of FORTRAN. But those systems were built by Real Programmers who eschewed type safety, quiche, and semiconductor memory).
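That failure can be simulated today. In the sketch below, setf is a hypothetical stand-in for assignment-as-a-method-call. The only real Ruby behaviour being relied on is that integers have no update method.

```ruby
# Stand-in for "place = value" if assignment were always a method invocation.
def setf(place, value)
  place.update(value)
end

begin
  setf(2, 3) # the pretend equivalent of writing 2 = 3
rescue NoMethodError => error
  missing_method = error.name # the method that 2 could not respond to
end
```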

I think it’s a good thing when languages are highly consistent, in the vein of Scheme, Smalltalk, and Forth. Although you cannot redefine “:=” in Smalltalk for select methods, you can’t redefine it for anything else either, so at least we’re clear that you can never do it, not do it some of the time. Such languages are remarkably consistent, they are quite stable, and although they may be unfamiliar to someone approaching them from a different paradigm, they are highly discoverable.

Turtles, it’s always the damn turtles

Like any program, languages evolve over time as people realize they need new features. A lot of the difference between languages boils down to a basic decision every implementor has to make when deciding how to add a new feature:

Should I add this feature? Or should I make this feature possible?

Adding a feature is like adding a new case to an if statement. You want List Comprehensions? You’ve got it. Syntactic sugar for closures? Done.

Making a new feature possible is the deeper, more challenging option. It involves rewiring the way the language works such that the new feature becomes a special case of a more general language capability. So instead of modifying Ruby to support select=, you add generalized variables, which make select= possible.

Some people will argue with this,3 but I think languages where “It’s turtles all the way down” are better than languages that are exceedingly well thought-out and painstakingly chosen collections of features that amount to a heap of if statements and special cases for everything.

Turtles all the way down is a philosophy where you try to define a ridiculously minimal set of orthogonal axioms and build the language by combining those axioms. At the lowest level it feels like pure math: Combinatory Logic can be built out of three, two, or even one combinator if you are careful. Scheme has five special forms. But with the right abstractions and syntactic sugar on top, you can produce a language with an amazingly diverse feeling to it.
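For a taste of that pure-math feeling, here is a small sketch of two classic combinators written as curried Ruby lambdas, with a third derived from them rather than defined separately. The variable names are just this example's choices.

```ruby
# K returns its first argument and ignores the second.
k = lambda { |x| lambda { |_y| x } }

# S applies f to x, then applies that result to (g applied to x).
s = lambda { |f| lambda { |g| lambda { |x| f.call(x).call(g.call(x)) } } }

# The identity combinator needs no definition of its own: I = S K K.
identity = s.call(k).call(k)

identity.call(42)      # returns 42
identity.call(:turtle) # returns :turtle
```

Nothing new was added to get identity: it fell out of combining the two pieces already on the table, which is the property the rest of this section is arguing for.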

But the advantage of having everything in the language built on top of a small set of axioms is that when you want to make a new feature possible, it’s often just a case of burrowing down a level of abstraction and combining the existing things in new ways. For example, it could be as simple as rewriting select to return generalized variables instead of values… if Ruby really treated “=” as a method instead of treating it as a special case that is sometimes a method and sometimes something else and sometimes a third thing.

Of course, you can make new things possible in languages that aren’t turtles all the way down. But your new things never feel like they fit with the things already in the language, just as pairing select and update doesn’t fit naturally alongside something, something=, [], and []= in Ruby. And that means that the things you build aren’t as discoverable or as readable as they could be.

I started by saying that I like the way you can define lvalues in C++, exactly because you can define methods like select that return generalized variables. But following that thread, I ended up with the conclusion that what I really like is a language with as few special cases as possible, a language which feels like everything is built out of a small set of well-chosen primitives or axioms that I can combine and recombine to create new things that work just like the old things.

A language where it’s turtles all the way down.


  1. Now this essay is about turtles, so I am probably insane for starting with an anecdote that has every potential of blowing up into a debate about the right way to interview people. So I’ll just say a couple of things about that: First, it’s a screening question. If you ask it and then argue over the phone with the candidate about whether adding the const keyword to Java is a good idea, you are missing the point. The point in a screening question is to weed out people who are flat out lying about their experience, not to decide if they happen to be your intellectual and cultural clone. The moment they demonstrate that they actually have some non-trivial experience with the two languages, you are done.

    Second, language nuänce questions have limited value. This is because while there is a correlation between tools knowledge and experience, the argument for causality is tenuous at best. Meaning, people with non-trivial experience usually know a lot more about their tools than people without experience. However, it is possible to learn a lot about a tool without actually having real-world experience. And maybe somebody knows a lot about a tool but doesn’t know the answer to the specific trivial question you want to ask. So… a question like this has some utility as a filter, and perhaps it is useful in a longer interview if you use it as an excuse to get the candidate to talk about how their proposed changes would have helped them with their actual projects, which leads them into talking about their experience.
    [back]

  2. You can read about them in On LISP: Advanced Techniques for Common LISP. It’s available on line, or you can download the pdf.
    [back]

  3. Here’s one such argument: Perl, the first postmodern computer language.
    [back]
 

Monday, February 04, 2008
  Off Topic: Raganwald Names Names
Someone asked where I get the links I post in various places (like my weblog’s RSS feed, (ruby|programming).reddit.com, dzone, and news.ycombinator.com). Here’s what I do:

I scan my existing subscriptions and link aggregators like programming.reddit.com looking for articles I like. I like to follow links from links as well; sometimes they are much more interesting than the post that first caught my attention. If I like a post I find, I use a Firefox button to tag it in del.icio.us, and that adds it to my RSS feed’s daily links: I have a tag called weblog, and I serve my RSS feed with feedburner. One of the myriad options for feedburner is to integrate del.icio.us links, and I set it up to merge just that tag, not all of my links.

When I like an article I try to quickly have a look at the author’s other writing. In many cases I add the author to my subscriptions. Firefox again, it’s a click or two to add the feed. I will drop feeds after a while if they drive me insane, of course. That’s why I don’t have a whole lot of Apple links, it’s hard for me to stay interested.

Would you believe I am a shameless egocentric? I have some technorati and google searches set up as RSS feeds, looking for mentions of my name or backlinks to my web posts. I almost always post articles that link to me, especially if they strongly agree or disagree.

If it’s a twitter-style post where I am quoted and the author doesn’t say anything else, I usually won’t pass it along, although I will have a look at the weblog and I often subscribe to it. I obviously like being quoted, but I like things that keep the ideas rolling even more. (Please don’t stop twittering, but when you have a moment, take a minute or two to share a piece of you with us. And that goes for any follow-up ideas, even (or especially) if they are in the form of pointing out that my weblog is proof that if any idiot can afford a printing press, any idiot will self-publish.)

I also read the comments on this weblog. Some of them include links to interesting posts. I try to check them all out. FYI, when leaving a comment you can use an old-fashioned anchor tag (<A HREF="link.url">link text</A>) to create hyperlinks back to your weblog or any other post you think is relevant. And there’s this thing called email. People email links to me, including links to posts they have just written that they think I might like. I try to read it all.

And that’s where I get my links. Cheers!

p.s. Here’s an opml snapshot of my subscriptions. If your weblog isn’t on it, feel free to drop me a line when you write something I might like reading. I do clean house from time to time, and I may need to be reminded of your writing.
 

  Drama in the North Atlantic
It’s like tying the Titanic to the iceberg. It’d keep you from sinking just long enough to freeze to death.
—Andy Baio’s take on Microsoft buying the sinking Yahoo

Via Daring Fireball, which serves up a cogent analysis. The prognostication: Microsoft will either sell or shutter Yahoo’s products and migrate customers onto Microsoft’s products like Hotmail.

and another…

The Borg-Yahoo merger won't work. Here’s why. It’s like taking the two guys who finished second and third in a 100-yard dash and tying their legs together and asking for a rematch, believing that now they’ll run faster.

Here’s the weird thing: I first heard that line about the 100-yard dash from Ballmer himself, maybe a decade ago.
—Fake Steve Jobs on why Steve Ballmer is out of ideas