Paul W. Homer on The Nature of Simple

(This is a snapshot of my old weblog. New posts and selected republished essays can be found at raganwald.com.)

Sunday, December 23, 2007

Back in [my VMS] days, we could place as many C functions as we wanted into individual files. One developer, while working, came up with the philosophy of sticking one, and only one, function into each file. This, he figured, was the ‘simplest’ way to handle the issue. A nice side effect was that you could list out all of the files in a directory and, because the file name is the same as the function name, this produced a catalog of all of the available functions. And so off he went.

At first this worked well. In a small or even medium system, this type of ‘simplified’ approach can work, but it truly is simplified with respect to so few variables that we need to understand how quickly it goes wrong. At some point, the tide turned, probably when the system had passed a hundred or so files, but by the time it got to 300 it was getting really ugly…

Although many of the functions were related, the order in the ‘dir’ was not. Thus similar functions were hard to place together. All of this was originally for a common library, but as it got larger it became hard to actually know what was in the library. It became less and less likely that the other coders were using the routines in common. Thus the ‘mechanics’ of accessing the source became the barrier to prevent people from utilizing the source, which was rapidly diminishing the usefulness and work already put into it…

By now, most developers would easily suggest just collapsing the 300 files into 12 that were ordered by related functions. With each file containing between 10 and 30 functions (and named for the general type of functions in the file, such as date routines), navigating the directory and paging through a few files until you find the right function is easy. 12 files vs. 300 is a huge difference.

Clearly the solution of combining the functions together into a smaller number of files is fairly obvious, but the original programmer refused to see it. He had “zoomed” himself into believing that one-function-per-file was the ‘simplest’ answer. He couldn’t break out of that perspective. In fact, he stuck with that awkward arrangement right up to the end, even when he was clearly having doubts. He just couldn’t bring himself to ‘complicate’ the code.

—Paul W. Homer, The Nature of Simple
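
To make the grouped layout concrete, here is a minimal sketch of what one of those twelve files might look like. Everything in it is an assumption made for illustration: the file name `date_routines.c` and all three functions are hypothetical, and Homer's post does not show the actual library.

```c
/* date_routines.c: a hypothetical grouped source file.
 * All names are illustrative; none come from the library in the anecdote. */
#include <assert.h>

/* Returns 1 if year is a leap year in the Gregorian calendar, else 0. */
int is_leap_year(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

/* Returns the number of days in the given month (1 = January). */
int days_in_month(int year, int month)
{
    static const int days[] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
    assert(month >= 1 && month <= 12);
    if (month == 2 && is_leap_year(year))
        return 29;
    return days[month - 1];
}

/* Returns the ordinal day within the year (1 = January 1). */
int day_of_year(int year, int month, int day)
{
    int total = day;
    for (int m = 1; m < month; m++)
        total += days_in_month(year, m);
    return total;
}
```

Under one-function-per-file, these would be three separate files scattered alphabetically through a directory listing. Grouped, the related routines sit on one screen, and the helper relationships between them (`day_of_year` calling `days_in_month`, which calls `is_leap_year`) are visible at a glance.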

Just one thought to ponder: sometimes what is simple at one scale is not simple at another. You can use this argument both ways when thinking about idioms and programming paradigms.

You could argue that things like functional programming techniques don’t scale to larger teams. You could also argue that sticking to the lowest-common-denominator patterns works fine for small projects, but you need to employ higher levels of abstraction if you are going to keep a large application with many maintainers flexible and cost-effective.

Hmmm.

¶ 11:45 PM

Comments on “Paul W. Homer on The Nature of Simple”:

What I often want to say is "simplicity is fractal". Then I immediately start worrying that this might not mean what I want it to mean...

I guess what I really mean is that there are only a few strategies that work to bring about simplicity, yet these strategies are applied differently at different scales, yielding different results. As you say, it's easy to think you have the problem licked, only to step back and realize you've only taken care of one level of scale.

# posted by Laurent Bossavit : 5:03 AM

Wouldn't having several subdirectories with the related functions in them serve the same purpose as having several files with the related functions in them?

# posted by steve : 5:52 AM

How about embedding tags in the comment lines of each file, then searching for those?

You could then search on several criteria at once.

# posted by DavidM : 11:31 AM

I like steve's comment. Although I don't agree with the specific idea of one C function per file, I think it's surprising how far you can scale an idea with a little thought.

I think an underlying principle of simplicity is that you have to impose some level of organization on the problem, albeit hopefully minimal. If you say one function per file, then you're sacrificing some ability to organize the problem.

# posted by showell : 9:33 PM
