Make Money Fast!
Kevin Barnes explaining the consequences of
“great” coders1 getting paid far too little:In the end, this means that really great coders will keep getting paid less than they are worth and average ones will keep getting paid more, so the economic benefits of great skill will go primarily to the companies with the best employees and not to the employees themselves.
I have news for anyone outraged or disappointed by this true statement:
all great employees are paid far less than the value they generate, even employees like salespeople that are paid proportionally.
Let me put it in this painfully direct manner: who lives better, the slave in the field, the slave in the house, or the master who owns the plantation?
Purely and simply, the money is in owning your own business. Every time. Without exception.
2 The point is that if you are any good at what you do,
and if you want to earn more money, then you need to found a business. It can be a start up, a consulting business, or even a
night club.
3 That’s how you’ll get paid what you’re worth.
I’m not even close to being the first to point this out. ‘Props’ to Paul Graham
4 for suggesting that starting companies is the best way for twenty-somethings to get paid something close to what they’re worth.
Update: A few people have pointed out that running a business is a skill that is orthogonal to coding. True. And a few people have pointed out that running a business is risky. Also true, and related: running a business without a modicum of business talent is akin to programming without being able to code fizzbuzz.
5 That does not change the basic fact that if you wish to be paid more money, you must successfully run your own business.
Look, I’m not saying anything in life has a guarantee. If I said “A Porsche is faster than a Chevrolet,” you can point out that most people either cannot afford a Porsche or reasonably choose not to purchase one. But that doesn’t change the fact that the Porsche is still faster.
So if you have good reasons for not starting a business, fine. But that does not change the fact that the people who run businesses make more money than the people who work for the people who run businesses.
- Let’s not argue about whether the bike shed ought to be green. If you think that coding is not a valuable skill, that the value is in communication, or architecture, or eliciting requirements, or some other characteristic of great software developers, feel free to apply some white out to your display and write the appropriate word in. Great developers, however you measure great, are underpaid. By the way, I have no evidence that great coders cannot communicate, do not design great architectures, or are unskilled at divining good requirements. So there.
- The one exception I can think of raising would be Michael Milken, who earned more than half a billion dollars as an employee of Drexel Burnham Lambert. But a cursory examination of his circumstances reveals that he was running his own business under the Drexel banner.
- Or possibly not a night club. See the comments.
- Paul’s book, Hackers and Painters: Big Ideas from the Computer Age
is another must-read.
- Actually, getting and keeping a programming job without being able to program is significantly easier than running a business without any talent for it.
Labels: jobs
Haskell, Ruby and Infinity
Languages like Haskell support
lazy evaluation. In principle, you only compute what you actually need, everything else just goes away. If you usually need everything you compute, this may seem like a frill: elegant, even interesting, but having little practical importance. I find it is very much like Tail Call Optimization: if you don’t have it, you code around it. Often it makes no difference, but from time to time there is a case where your code will be clearer and more maintainable if you express yourself succinctly and let the compiler do the work of making it efficient.
This post is a switch from mumbling about
Y Combinators and using trivial cases to explain interesting ideas: I’m going to show some actual code I’m writing. I apologise in advance if this adds so much background that it obscures the message about lazy evaluation: extremely simplistic examples sometimes work against the argument because it is too easy to think of other ways to accomplish the needed results.
I don’t apologise
at all for the unpolished nature of the code. This isn’t a textbook. And besides:
Anybody can say you can’t write. Let no one say you don’t.
—Ken Rand, courtesy of Chalain
I’ve been
hacking some naïve code to cluster data sets.
In computer programming, a hacker is a software designer and programmer who builds elegant, beautiful programs and systems… For some, “hacker” has a negative connotation and refers to a person who “hacks” or uses kludges to accomplish programming tasks that are ugly, inelegant, and inefficient.
—Wikipedia
The clustering algorithm requires a very large, fixed set of curves.
1 I wrote the initial curve generation by building a gigantic list of parameter tuples, and then processing the list into records. Once the “search space” grew beyond a trivial size, the program began to eat enormous amounts of memory. The problem was that I was trying to write the generation code as clearly as possible, and that created a massive thicket of objects that all resided in memory simultaneously.

The curves being generated are paths composed of cubic beziérs. Each segment on the path requires specifications for four different control points.
I rejected the idea of a rewrite into looping, imperative form, I wanted to separate the “list comprehension” code from the “make it run in less than a gigabyte” code. Instead, I chose to use lazy evaluation: the principle is that although the code defines a huge data structure that takes up gigabytes, we only actually evaluate things as we need them, and we discard them when we are done, so the total memory footprint for the lazy form is about the same as the memory footprint for an imperative form.
The easiest way to perform the refactoring—besides a rewrite in Haskell—was to switch all of the arrays I used for a lazy list data structure. A lazy list is a linked list composed of head and tail tuples (known as
cons cells). Where a normal linked list holds an element and the rest of the list, a lazy list holds an element and a function for computing the rest of the list (
SICP calls them “
streams.”)
If a modicum of care is taken not to pin objects down, you can build fantastically large lazy lists and process them at will. In fact, lazy lists can be infinitely large if you provide the appropriate function for generating the list
For that reason, you should never append one lazy list onto another lazy list. Consider
odd = LazyList.unfoldr(1) { |n| n + 2 } and
even = LazyList.unfoldr(0) { |n| n + 2 }. If you append
odd onto
even, the resulting list in theory has both even and odd numbers. But in practice you can never reach the odd numbers because there is an infinite quantity of even numbers at the front of the list.
Instead,
merge them together to produce a lazy union of the two lists. Merge interleaves alternating elements from each list, so the resulting lazy list contains all the elements of each list, and if you take a finite sample, such as the first 1,000 elements, you get 500 of each.
even.merge(odd) => (0 1 2 3 ...). (Other interesting operations that work with infinite lists include
cons,
pairwise and
product.)
Any sufficiently complicated Lisp or Ruby program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Haskell.
My current version of a
LazyList class in Ruby is
here. As soon as I switched from the “eager” code using arrays to the “lazy” code using lazy lists, memory use went down and performance went up. Not as much as a switch to imperative (and not even close to writing a better clustering algorithm), but enough that I can move forward.
Here is an example of a function using arrays to enumerate choices (
distribute(2, 1..4) -> [[1, 2], [1, 3], [1, 4], [2, 3], [2, 4], [3, 4]]):
def distribute choices, range
spots = (range.end - range.begin) + 1
return [] if choices <= 0 || choices > spots
remaining_choices = choices - 1
if remaining_choices == 0
return range.to_a.map { |spot| [spot] }
else
(range.begin..(range.end - remaining_choices)).to_a.inject([]) { |acc, first_spot| acc + (distribute(remaining_choices, (first_spot + 1)..range.end).map { |remaining_distribution| remaining_distribution.unshift(first_spot) }) }
end
end
And rewritten with lazy lists:
def distribute choices, range
spots = (range.end - range.begin) + 1
return LazyList.new() if choices <= 0 || choices > spots
remaining_choices = choices - 1
if remaining_choices == 0
LazyList.from_range(range).map { |spot| LazyList.from_element(spot) }
else
LazyList.from_range(range.begin..(range.end - remaining_choices)).mapr(:merge) do |first_spot|
distribute(remaining_choices, (first_spot + 1)..range.end).map do |remaining_distribution|
LazyList.cons(first_spot, remaining_distribution)
end
end
end
end
It’s almost identical.
LazyLists replace arrays,
mapr replaces accumulating a list with
inject, and
cons replaces
unshift, but otherwise it’s the same code. Mission accomplished: we’ve changed the behaviour without having to change the way we express our algorithms. Had we wanted to, we could have supported enough array syntax that Lazy Lists look exactly like arrays, but that isn’t necessary.
And when running the entire generator, the memory footprint is dramatically lower. For example, this small routine quoted above no longer generates an array of arrays: it generates a structure that can generate the lists when needed. Switching the data structure changes the evaluation behaviour of the generator code: it gets us 80% of having an imperative structure without having to tie the code to the implementation.
This is exactly what separation of concerns is all about:
green code here, yellow code there.
Lazy evaluation is the red pillIt’s great to make a change to the code’s behaviour without changing the way we think about the solution to our problem. But you know what? It isn’t
interesting. In fact, if lazy optimization was something we needed on a regular basis, we ought to switch to a
language that does it for us. Implementation details ought to be left to libraries, compilers, and virtual machines. That’s what they’re for.
A language that doesn’t affect the way you think about programming, is not worth knowing.
—Alan Perlis
Now that we hold in our hands a tool for making infinitely long lists, the question to ask is, “how does that affect the way we think about generating sample curves?”
The first thing that comes to mind is to ask why the set of sample curves we generate is finite. It isn’t really finite, there are an infinite number of curves. (And not just any infinity, it’s Aleph One, the children’s book
One Two Three… Infinity
explained that forty years ago.)
What we really want is a
finite sample of that infinite set of curves. Now let’s zoom in and think about a single parameter, the
y value for one of the control points on a curve. Let’s say it’s a rational number between
0.0 and
1.0 inclusive. It’s relatively easy to generate a list of rationals in that range. Say we generate
0.0000...00001 (with jillions of zeroes elided: we want the smallest number our computer can represent), followed by
0.0000...00002 (the second smallest), and so on.
If we leave the computer running for a few months or years, we should get all of the numbers between
0.0 and
1.0 that our computer can represent. That shouldn’t be hard. But even taking into account the finite limitations on a computer’s representation, that list is infinitely long.
Useful samplesWe need to take a finite sample of all of the infinite possibilities in that list.
When working with an infinite list, what we want is that the front part of the list, the part we can access before the heat death of the Universe, contains a useful sample. If we were sampling a list like
(0.0000...00001 0.0000...00002 0.0000...00003 0.0000...00004 ... ) for control points, the first values are not useful at all. We’d just get a line along the
x access.
There are various strategies for generating a more useful list. We could create an infinite list of random values. I’ll outline another method below. But the take-away idea is this: we need lists where the most useful values are first. Working with infinite lists encourage us to think of infinitely sized sets of samples, ordered by usefulness.
Let’s say that when we generate
y, we devise a list like
(0.0 1.0 0.5 0.75 0.25 ... ). We’re successively bisecting the intervals (which is why the library method is called
binary_search). We can stop at any time and we have decent distribution. (And we can pipe dream that in the future, we can train our generator to favour points more likely to be useful to us.) That is much more useful!
Here’s some Lazy List code that does just that: given two seeds and an optional bisection block, it generates an infinite list of values:
def binary_search seed1, seed2, &block
bisection_block = block || lambda { |x, y| (x-y)/2 + y }
LazyList.cons(
seed1,
LazyList.new(
seed2,
delay {
bisect(seed1, seed2, &bisection_block)
}
)
)
end
# infinitely bisects a space, exclusive of the seed values
def bisect seed1, seed2, &block
return EMPTY if seed1 == seed2
half = block.call(seed1, seed2)
LazyList.new(
half,
delay {
merge(
bisect(seed1, half, &block),
bisect(half, seed2, &block)
)
}
)
end
And indeed,
LazyList.binary_search(0.0, 1.0) -> (0.0 1.0 0.5 0.75 0.25 ... ).
The Grand Hotel
Satan, Cantor, and Infinity and Other Mind-boggling Puzzles
is a five-star introduction to Cantor’s work on infinity, including a special treat: a completely different proof that Aleph One is greater than Aleph Zero based on games, very much in the style of Conway’s Surreal Numbers. Currently out of print, but
please get yourself a used copy from a bookseller: you won’t be disappointed!
What happens when we try to combine two infinite lists? For example, we want a list of
(x, y) tuples, generated as the Cartesian Product of our binary search list with itself. If we use a typical breadth-first or depth-first algorithm, we’re sunk. We get
( (0.0 0.0) (1.0 0.0) (0.5 0.0) (0.75 0.0) (0.25 0.0)... ). That has a nice distribution along one axis but still useless. What we need is a way of combining two infinite lists such that they give us a nice useful sample when we take a finite number of elements off the front.
Here’s a table describing how the ‘breadth-first’ algorithm for generating the product of two lists works. Consider two lists of three elements each. In the table below, columns represent one list and rows the other. The numbers represent the order that the product of the lists is generated:
| Breadth-first mapping from elements to integers |
|---|
| 1 | 2 | 3 |
| 4 | 5 | 6 |
| 7 | 8 | 9 |
This works fine for finite lists, but as we saw above it is useless for infinite lists. But looking at it in table form is useful. The product of two lists
is tabular, the problem with a traditional algorithm is the order that we select elements from the table, not that the elements are tabular. Just as we solved the problem of how to sample
y magnitudes by changing the order in which we selected rationals between
0.0 and
1.0, what we need to do with Cartesian Products, what we need to do is select the products in an order that provides useful results.
As it happens, there’s an order that generates useful finite samples when dealing the product of two infinite lists: Instead of generating rows and appending them to each other, it generates
diagonals and
merges them with each other. The first diagonal is
1, 5, 9. The second is
4, 8. The third is
2, 6. And so on. The advantage of this algorithm is that if given two infinite lists, it starts in the upper left-hand corner and supplies values, working its way right and down.
2Here’re the first four values of the product of two binary searches with each other:
[{:x=>0.0, :y=>0.0}, {:x=>0.0, :y=>1.0}, {:x=>1.0, :y=>0.0}, {:x=>0.0, :y=>0.5}]. And the next four:
[{:x=>1.0, :y=>1.0}, {:x=>1.0, :y=>0.5}, {:x=>0.5, :y=>0.0}, {:x=>0.0, :y=>0.25}]. And the next eight:
[{:x=>0.5, :y=>0.5}, {:x=>0.5, :y=>0.25}, {:x=>0.5, :y=>1.0}, {:x=>1.0, :y=>0.25}, {:x=>0.25, :y=>0.25}, {:x=>0.25, :y=>0.75}, {:x=>0.25, :y=>0.0}, {:x=>0.0, :y=>0.75}].
We can take as many samples as we want, and more sample give us more “resolution.” We now have the tools we need to get rid of all of the limits and finite combinations and focus on describing what we’re trying to generate without intermingling a lot of generating code.
And I’m thinking about eternity
Some kind of ecstasy got a hold on me
—Bruce Cockburn, Wondering Where The Lions Are
Now that we can generate an infinite list of useful magnitudes, plus we can combine infinite lists in a useful way, we can define an infinite list of sample curves. Here’s the exact definition of an infinite list of partial beziérs (there are only three control points because the origin of the beziér is always the terminus of the previous beziér in the curve we are generating):
def sample_cubic_beziers
magnitudes = LazyList.binary_search(0.0, 1.0)
control_points = LazyList.cartesian_product(magnitudes, magnitudes) do |x, y|
{ :x => x, :y => y }
end
LazyList.cartesian_product(control_points, control_points, control_points) do |p1, p2, p3|
{
:p1 => p1,
:p2 => p2,
:p3 => p3
}
end
end
That’s really simple, and really easy to understand.
3 The domain-specific code that defines a cubic Beziér is concise and readable. And the infrastructure code that handles combinatorics is shunted away out of sight where it doesn’t clutter up our thinking.
So what have we seen?
- Lazy evaluation solved a performance problem without needing an extensive rewrite of our generation code;
- Taking the red pill, infinite lists allowed us to radically simplify the original code.
Are there some opportunities to steal an idea like lazy evaluation from other languages and use them to simplify your code?
- The data is a set of
(x,y) tuples where x is time and y is a magnitude. The effort remaining in a sprint backlog is an example of this kind of data set. I need a function that computes the distance between any two data sets. I already have a function that computes the distance between a curve through the xy space and a data set, so if I build a lot of curves and then take the distance between each curve and each data set, I can find the distance between pairs of data sets by searching for the curve with the lowest total distance.
This operation is On2, but that’s why we invented distributed processing. Also, this operation is done infrequently relative to operations making use of the clusters, something like Google’s indexing being far less frequent than searching Google. And of course, when my MacBook overheats and burns its way through my desk, I can come back and replace the brute force operation with something more intelligent.
Well, all this requires generating the sample curves. I have code that makes curves out of Beziérs, provided I supply coördinates for the control points. the curves are actually ActiveRecord models. So all I needed was some simple code that generates all of the possible combinations and then write them to the database. It’s the kind of thing candidates write in job interviews where employers actually care about their stone cutting skills.
- This problem is congruent to the proof that the set of all points on a plane is the same size as the set of all points in a line.
- And it could be simpler yet: a “list comprehensions” DSL for lazy lists
is on the wish list would make things far more readable.
Labels: lispy, ruby
Programming Language Stories
CA fellow needs a dog house, so he hires a contractor. This guy is a C programmer who’s between jobs at the moment, and he says forget those wimpy “agile” wooden dog houses, he can build the dog house out of bricks, it will last forever and is stronger than any wood structure.
The client is doubtful, but that sounds very impressive, and the contractor says this is how all the real-world dog houses are made, so he agrees. Well, it takes forever: it seems every time the contractor thinks it’s finished, they discover there’s a chink between the bricks and there’s a leak. With wood you just trim to fit, but with bricks you have to plan everything in advance so that the bricks line up just right.
But Spring turns to Summer turns to Autumn, and the job is done. The client pays up, and the contractor’s about to leave when he stops and thinks for a moment before speaking.
“Hey,” the contractor says, “You paid for a pallet of bricks, and I used all but one. It’s a shame if it goes to waste. Why don’t I see if I can use it somewhere?” The client asks like what, and before you know it the contractor has scored a contract to build a brick path for the dog from the back door to the dog house. He orders some more bricks, gets to work right away, and this time things go a little better.
The contractor is now doing C++ on the side. And he uses some new bricklaying techniques he learned from
Effective C++
: there are these funny angle braces everywhere thanks to his path building templates. Soon that job is done and the contractor collects his fee. He’s about to leave, but he reaches into his tool chest and pulls out a brick.
“Hey,” the contractor says, “You paid for another pallet of bricks, and I used all but one on the path. It’s a shame if it goes to waste. Why don’t I see if I can use it somewhere?” The client is happy with the path, asks like what, and before you know it the contractor has scored another contract, this time to build a brick wall enclosing the yard. He orders some more bricks, and gets to work right away.
Meanwhile, the contractor has become a Visual C++ programmer and reads MSDN magazine religiously. He applies his new skills to building the wall: he has all these fancy tools that can put up sections of scaffolding when you push a button. of course, there are huge holes you have to fill in yourself, and the scaffolding has all these crazy joins and angles, it’s hard to change anything or understand how the entire wall fits together, but it feels like you’re getting a lot done because the system seems to move a lot of bricks for you.
Well, the wall takes forever, and it doesn’t look that good when it’s finally done. But the client pays up and the contractor is ready to leave. But he stops for a moment and pulls a brick out of his tool chest.
“Hey,” the contractor says, “You paid for two more pallets of bricks for the wall, and I used all but one. It’s a shame if it goes to waste. Why don’t I see if I can use it somewhere?”
The client has had enough of this, and he decides he’s going to show the contractor how he feels. “Give me that brick,” he orders. The surprised contractor hands it over. “I don’t give a rat’s ass about this brick” the client huffs, and he throws the brick clear out of his yard.
“I need to get work done, not talk to contractors all day about bricks. And besides, I’ve talked to the big landscape contracting companies, and they all say nobody is using bricks, everyone has switched to sharp tools and netting. Now get out.”
Ruby on RailsJust inside the gates of heaven, St. Peter sits at a desk checking people in.
Peter: “Welcome to heaven. Programming language?”
The man at the front of the line says, “
Smalltalk
.”
Glancing at his clipboard, Peter says, “Room 33. Be very quiet as you pass Room Six.”
The process repeats itself with the next person in line:
Peter: “Welcome to heaven. Programming language?”
Person #2: “
Common Lisp
.”
Peter: “Room 17. Be very quiet as you pass Room Six.”
The next person moves to the front of the line with a look of curiosity on her face.
Peter: “Welcome to heaven. Programming language?”
Person #3: “
Python
.”
Peter: “Room 54. Be very quiet as you pass Room Six.”
Person #3: “Why do you keep telling us to be quiet as we pass Room Six?”
Peter: “Because the
Ruby on Rails
People are in Room Six, and they think they’re the only ones here.”
—Snarfed from Eric Sink’s Baptists and Boundaries
JavaA mother was preparing the pot-roast for Sunday’s big family dinner. Before searing it and placing it in the pan, she carefully sliced the ends off. Her three year-old daughter asked “Mommy, why do you cut the ends off the roast?”
She answered, “My mom taught me to do it that way, and it’s delicious, so it must be a good idea. Maybe the juices from the meat mix with the vegetables?”
Everyone sits down for dinner, and when Father is serving the roast, the daughter remembers her question. She turns to Grandma and asks, “Grandma, why did you teach Mommy to cut the ends off the roast?”
Grandma thinks for a moment and says, “What a delightful question! I always used to cut the ends off the roast, it’s how
my mother taught me. I don’t know why she did that, there must have been a good reason.” Grandma sits for a moment, remembering her mother. “Well,” she continues, ” there must have been a good reason. Now eat your dinner before it gets cold!”
The holidays roll around, and they’re having dinner at Grandma and Grandpa’s house. “Hey!” Grandma says to the little girl, “You know what, I was going through your great-grandmother’s things, and I found the old roast pan. We sure made some good pot roast in it. Let’s make pot roast with it tonight in her honour!”
They get out the pan and wash it up. It’s old, and well-seasoned. Grandma looks it over. “It’s smaller than I remember. I was a little girl, and everything looks bigger when you’re small!”
They put the roast on the counter before searing it. Just then, Grandpa walks by. “Do you want me to cut the ends off the roast?” he asks. “It’s the only way you’ll get it to fit in that small pan.”
Grandma and the little girl look at each other. And they smile.
SchemeA fellow is on vacation, and he decides to ride one of those “scenic” double-decker buses that drive around from attraction to attraction. The upper level is open-air, and he’s the only one up there. He looks around, and decides that instead of re-reading his copy of
The Little Schemer
, he’s going to really enjoy himself: he lights a cigar.
No sooner has he lit the cigar when a woman comes huffing and puffing upstairs, carrying one of those sub-miniature dogs people use as fashion accessories. She takes one look at our man and demands he put the cigar out: it is
offensive to the dog.
“Madam,” our man asserts, “this cigar is not offensive. Let me tell you its secrets.”
“Eschewing inferior mass-market tobacco, I order whole, functional leaves—at great expense and with difficulty—from a small collective in Cuba that still grow cigar tobacco the old-fashioned, original way.”
“Unsullied by automatic rolling machines or other push button devices, I roll each cigar, myself, by hand. Using secret tobacco-macro techniques that are impenetrable to the typical ignorant cigarette smoker, I am able to pack as much smoking pleasure into a single cigar as would take a carton of cigarettes—nay, it is impossible to enjoy a smoke from regular cigarettes as much as I enjoy the elite, exclusive flavour of my cigars.”
The woman is not impressed. “If your cigars are so good, how come nobody smokes them?” she asks. “If your cigars are so good, how come so many clubs and restaurants permit cigarettes but forbid cigars?”
Our fellow is so irate he begins to splutter. “If you or your dog are unable to understand that the purity, the perfection achieved long ago on a remote island far from consumerism and mass productio—” But he is unable to finish, for the woman imperiously sweeps the cigar from his mouth and flings it over the side of the bus.
“That’s what I think of your pompous cigar.” She tells him.
He glares at her for a moment, then years of oppression, of being forced to live with the hated cigarettes against his will, bring him to a boil and he commits an unthinkable act. With a wrench, he seizes the dog and flings it after his cigar. There is a moment of uncomprehending shock, and then the woman begins to howl.
Well, the bus driver hears all this and slams on the brakes. For a moment everything is higgledy-piggledy, and just as the woman’s wails return to their full volume, everyone is shocked to see the dog come trotting up, unharmed.
And what do you know: the dog has—nestled gently in its mouth—the most unexpected thing:
Gur Oevpx.
If you enjoyed these stories, you may also like
Why You Need a Degree to Work For BigCo and more recently,
A Venture Capitalist passes away peacefully, and....
Labels: java, lispy, ruby
A complete pain
I’m not against types, but I don’t know of any type systems that aren’t a complete pain, so I still like dynamic typing.
—Dr. Alan Kay on The Meaning of “Object-Oriented Programming”
Program in Java? You must be joking!
The Y combinator design pattern in Java is easily understood and can be used and maintained by unskilled, entry-level programmers.
Cuius rei demonstrationem mirabilem sane detexi. Hanc marginis exiguitas non caperet.
You know, this kind of joke seems to rile Java apologists to no end. They come out of the woodwork with their web browsers set to
flame. Absolutely no criticism of the language that powers everything from the web to space exploration (although there is never any talk of toasters) is allowed.
And
do not criticize the culture in any way whatsoever. Although it’s perfectly ok to boast that Java is designed to appeal to the widest possible diversity of skill levels, you
may not suggest that Java programmers are stupid. Or else.
I am trying not to tell anyone what to do, but I have an observation. Have you ever heard a politically incorrect but extremely funny joke about a member of a particular culture? You know you have. And furthermore, the joke was probably told by a member of the group victimized in the joke.
The unwritten rule is, if you’re a member, jokes are
fair game. I was once with some, ahh, members of a religion that directly predated and evolved into Christianity. They were telling some jokes about their culture and religion. I had heard a few such jokes, but I wisely refrained from telling any. It’s not allowed. That’s the rule: outsiders may not joke.
So back to Java. Guess what? I’m an
insider. I write Java code every working day. If you care about hollow appeals to authority, I once wrote a Scheme implementation in Java. I was also the team lead for JProbe Threadalyzer, a tool that analyses multi-threaded Java behaviour. And the development manager for JProbe Server-Side Suite (the aforementioned thread analyser, plus profiling, code coverage, and memory debugging tools). And various J2EE implementations with various degrees of Enterprisy-ness.
None of that makes me an expert. Nor does it make me right when I criticize or joke. But it does give me a certain smug
right to joke. And don’t we all need a laugh from time to time, even if we’re laughing at ourselves? Perhaps
especially if we're laughing at ourselves?
I think it’s a sign of good health to be able to laugh at ourselves and criticize our foibles. For all of the talk of Java as a “mature platform,” don’t you agree that “not taking criticism well” is a little, well,
immature?
So since it’s Friday:
A Muslim, a Vegetarian, and a Java Programmer are traveling by foot, and they stop at a farm house to sleep for the night. The farmer is impressed at the obvious sophistication of the Java Programmer’s tales of Enterprise wonder, and he invites her into the house. The Muslim he sends to the hayloft, and the Vegetarian can sleep in the barn.
Well, the farmer is just pouring a night-cap and listening to the Java Programmer describe the time she knocked together a farm workflow application in less than a million lines of XML configuration code when there’s a knocking on the door.
He opens the door and the Vegetarian is standing there. “I’m sorry,” the Vegetarian apologizes, “But you slaughter animals in the barn, and eating meat is offensive to my beliefs. I cannot sleep in the barn.” The farmer thinks this is bunkum, but he was raised to be courteous to his guests, so he asks the Vegetarian to swap places with the Muslim.
The farmer knocks back his drink and turns down the lights. He can hear the Java Programmer setting up a sleeping bag factory to generate down-filled singleton sleeping containers in the living room. His wife is reading in bed, and he’s looking forward to catching up on the Wall Street Journal.
Well, he is just about to climb into bed when there’s a banging on the door. He opens the door, and the Muslim is standing there. “I’m sorry,” the Muslim apologizes, “But you keep pigs in the barn, and pigs are profane according to my beliefs. I cannot sleep in the barn.”
Muttering, the farmer rouses the Java Programmer off the couch and asks her to switch with the Muslim. He climbs into bed and has just started to read an interesting article on hedging commodity futures with convex derivatives when there’s a thunderous hammering at the door. His wife tells him to stay put and she goes to answer it. The farmer hears some excited talking, and a moment later his wife is at the bedroom door.
“Honey,” she says, “it’s the pigs.”
Labels: java, popular
Off Topic: Eliding Raganwald
My editing “policy”:
(Please imagine me making those oh-so-last-century quote marks with my fingers as I say the word
policy.)
I deliberately post as soon as I think an idea is finished enough to communicate the gist. Then, as time and inclination allow, I polish my posts. Sometimes readers provide some excellent feedback, and that gets worked in as well. To me, this reworking is a benefit of the medium. If you read and re-read my posts, especially in the first seventy-two hours, you’ll see the changes.
So far, this is probably not contentious. However, there is a prevailing style amongst bloggers to mark their edits, perhaps by
striking out words they have elided.
update: Or by noting that new words are, well,
new.
There are good reasons for that style. The primary place where people insist on it is when there’s some emotionally charged conversation, and someone retracts something they’ve said. The idea seems to be that if you just delete wrong words, you’re pretending you never said them, which is somehow hypocritical and reprehensible.
I do not subscribe to this view, and I do not hold others to it. My intent with my writing is to present as coherent a view as possible. Vast tracts of
strikeout and insertions can degrade a post.
That being said, you will find stricken words and updates scattered throughout the site. I use this style when I am trying to portray a thought process. For example, I recently described an unexpected behaviour and admitted I didn’t know what was going on, although I speculated as to what might be causing the behaviour. A reader managed to explain the behaviour in a comment. I struck out my original words and added an explanation. That particular post was about unexpected behaviours, so showing the process of going from not knowing what was going on to having someone else explain it added value to the post.
Nevertheless, there is no overarching consistent policy around edits. If recording my words exactly as I originally said them is important, someone ought to donate more funding to the
wayback machine so that it can achieve a finer granularity of archiving.
Is this a disclaimer?
Of a sorts. I fully expect that sooner or later someone is going to write an angry comment saying that I changed my words and didn’t keep the originals around as a scarlet letter of shame. Knowing my (bad) habit of writing emotional posts before I’ve had my third coffee, it’s inevitable that I’ll write something I’ll regret enough to retract.
I don’t expect that writing this will get me off the hook when that happens. I’m not saying to you, “no fair criticizing me for doing this, it’s in the fine print.” I’m writing this to explain, not to mitigate.
Ok, thanks for your patience with this wildly off-topic post. Time for my second coffee and to return to working with a team of programmers a cut above
JavaSchools idiots mediocrity certification-happy and buzzword-driven programmers the rest.
;-)
Why Ruby is not an acceptable implementation of Lisp
The other day, while actually writing software (as opposed to
talking about writing software), I wrote the following:
distance = lambda do |given_scale|
(magnitudes.map { |pair|
(pair.evidence - (pair.flow * given_scale)) ** 2
}).inject { |acc, distance| acc + distance }
end
If you are curious,
distance is a function that computes the distance between a curve (called a ‘flow’) and some evidence, given a float that scales the flow. It’s used for finding a curve that ‘looks like’ a set of data points.
Can you spot the bug? I’ll expunge all of the excess and give you some code you can try yourself:
require 'test/unit'
class TestRubyNames < Test::Unit::TestCase
def test_lambda_clobbering
foo = lambda { (1..3).map { |foo| foo ** 2 } }
assert_nothing_raised(NoMethodError) { foo.call() }
assert_raise(NoMethodError) { foo.call() } # WTF?
end
def test_local_clobbering
foo = [1, 4, 9]
bar = lambda { (1..3).map { |foo| foo ** 2 } }
bar.call
assert_equal(3, foo) # WTF?
end
end
Is this what you expect? Ruby
looks like a nice, block-structured language with lexical scope. But its standard implementation has serious issues. I believe these are
well-known and understood issues, but issues nonetheless.
One more example. I swore I wouldn’t write anything else about fixed-point combinators for a very long time, but try this exercise at home. Here’s my
curry combinator:
require 'test/unit'
class ExempliGratia < Test::Unit::TestCase
CURRY = lambda { |f, a| lambda { |*b| f.call(a, *b) } }
def test_clean_up_loose_ends
maker = lambda { |f|
lambda { |func_with_me| CURRY.call(func_with_me, func_with_me) }.call(
CURRY.call(lambda { |inner_func, me, *args|
inner_func.call(CURRY.call(me, me)).call(*args) }, f)) }
factorial = maker.call(
lambda { |f| lambda { |n| n.zero? && 1 or n * f.call(n-1) } }
)
assert_equal(120, factorial.call(5))
end
end
Looks good. Now let’s prove that
CURRY is referentially transparent by replacing every
CURRY variable with the lambda:
maker = lambda { |f|
lambda { |func_with_me| (lambda { |f, a| lambda { |*b| f.call(a, *b) } }).call(func_with_me, func_with_me) }.call(
(lambda { |f, a| lambda { |*b| f.call(a, *b) } }).call(lambda { |inner_func, me, *args|
inner_func.call((lambda { |f, a| lambda { |*b| f.call(a, *b) } }).call(me, me)).call(*args) }, f)) }
factorial = maker.call(
lambda { |f| lambda { |n| n.zero? && 1 or n * f.call(n-1) } }
)
begin
factorial.call(5)
puts 'fine'
rescue SystemStackError
puts 'wtf?'
end
Hunh? Ruby
looks like a nice language supporting anonymous functions, but sometimes they work and sometimes they don’t.
I strongly suspect that this issue has something to do with the fact that with the ‘naked’ lambdas we’re creating new functions with every call, whereas the version using a CURRY variable re-uses the same function, so the system stack overflows in one case and not the other.Update: Mike Harris proved that this second problem is the same as the first problem: once again, variables are clobbering each other all over the place. In Ruby 1.8, blocks do not create new scope,
at all.
Here's Mike's submission, indented to fit:
def test_full_anonymity
assert_equal(120, (lambda { |f0|
lambda { |func_with_me|
(lambda { |f1, a1|
lambda { |*b1| f1.call(a1, *b1) }
}).call(func_with_me, func_with_me)
}.call(
(lambda { |f2, a2|
lambda { |*b2| f2.call(a2, *b2) }
}).call(
lambda { |inner_func, me, *args|
inner_func.call(
(lambda { |f3, a3|
lambda { |*b3| f3.call(a3, *b3) }
}).call(me, me)).call(*args)
}, f0))
}).call(
lambda { |f| lambda { |n| n.zero? && 1 or n * f.call(n-1) } }
).call(5))
end
Stop your sobbingI’m
not saying Ruby is an unacceptable Ruby.
I’m
especially not saying “See, that’s why we should stick with language X.” If language X doesn’t even try to support this kind of programming, it’s a lame argument to say that it’s a better choice. That’s like saying that your 1972
Pinto is safer than my 2004 MX-3 because the Pinto doesn’t have air bags, and air bags can injure small children if they are sitting in the passenger seat.
If you try to bend Ruby to do Scheme-ish things, you will run into some corner cases, some places where it does the unexpected. Can you figure them out and avoid them (some people will say avoiding corner cases makes for a
Leaky Abstraction)?
I think if you’re writing Ruby code in Ruby, you can easily avoid these problems. That’s why Ruby is an acceptable Ruby.
Why isn’t Ruby an acceptable implementation of Lisp?But why isn’t Ruby an acceptable implementation of Lisp? (Besides the lack of support for
hygienic macros, of course!)
The problem is that Lisp or Lisp-style programming is all about metaprogramming. Code that writes code. Code that evals code. It’s easy to look at something like the curry combinator and say “no programmer would ever write that convoluted a function in production.” But macros and code generators routinely combine snippets of code to produce monstrosities that a human wouldn’t even considering writing.
When writing a piece of code that writes code, you are heavily dependent on the underlying implementation being regular and corner-case free. The whole point of writing code-writing code is that you intend for it to generate variations and combinations on your templates or scaffolds. And it’s only a matter of time before you are combining template A with generator B and meta-programming method C.
When the underlying language is robust, none of this matters. The result works. If you look at it under the hood, you’re horrified at the result. But frankly, that isn’t an argument against meta-programming, it’s an argument
for it: let the machine do the dirty jobs we don’t want to do, and let us deal with a nice abstraction like
A plus B plus C that just works, never mind what the code looks like.
(Don’t even think about arguing that we should do away with code that writes code. That debate was won by the code generation folks back when COBOL and FORTRAN and ALGOL were young. From time to time you’re welcome to pull out your assembly if you like, but we use High Level Languages because we want the machine to write the messy crap for us.)
My fear is that code that writes code might one day write code that clobbers a variable.
Or code that writes code might write some nested lambdas that overflow the stack.And that’s why Ruby is not an acceptable implementation of Lisp.
Well, ok, I confess: the title is a bit of muckraking.
Ruby is a pretty good implementation of Lisp, and anyone who has two brains to rub together can point out that the issue is not whether there’s some corner case “gotcha,” but
whether using Ruby is, on the whole, more beneficial than not using Ruby.
For what it’s worth, I consider corner cases like this a strong endorsement of trying new languages like Ruby: it’s like finding out that your sailplane has a tendency to go into an inverted flat spin when you loop at an unacceptably slow speed. It’s terrifying when it first happens, but you get through it and then you realize that the only reason you got into trouble was that you were pushing yourself hard.
Labels: lispy, ruby
Guest Blogger: Tom Moertel derives the Y combinator in Ruby
In a comment on But Y would I want to do a thing like this?, Tom Moertel derived an elegant version of the actual Y combinator in Ruby. I reproduce his words here with permission.
Tom uses the alternate syntax for call,
[...]. I avoided this to keep clear of a tiresome debate about operator overloading. (On one side: those who hate it. On the other side: those that hate Ruby disallowing overloading (...)).
Ok Tom, take it away!
Here’s my original Ruby definition:
def y(&f)
lambda { |*args| f[y(&f)][*args] }
end
To get rid of the recursive call
y(&f), let’s replace it with a variable
yf, whose value we’ll say is magically equivalent to the result of the recursive call. To effect the binding of
yf, we’ll need to wrap the method body in a new lambda expression and pass in the magic value:
def y(&f)
lambda { |yf|
lambda { |*args| f[yf[yf]][*args] }
}[ <<< placeholder for yf: we'll fill this hole next >>> ]
end
So, what value should we pass in for
yf? Because the variable represents the value that would have been computed by calling
y(&f), which is just the body of the
y method, we can fill the hole with a duplicate of the body:
def y(&f)
lambda { |yf|
lambda { |*args| f[yf[yf]][*args] }
}[ lambda { |yf| lambda { |*args| f[yf[yf]][*args] } } ]
end
And we have our Y combinator:
fac = y { |rec| lambda { |n| n < 2 ? 1 : n * rec[n-1] } }
fac[5] # ==> 120
While the combinator works, its implementation repeats itself: there are two copies of the lambda expression
lambda { |yf| lambda { |*args| f[yf[yf]][*args] } }
in the method body. To factor them out, let’s let the variable
x represent this expression. Then the method body simplifies to
x[x], which is very satisfying. To set
x to the correct value, however, we need to add a wrapper lambda that binds
x to the original expression. Putting it all together, we arrive at the final version:
def y(&f)
lambda { |x| x[x] } [
lambda { |yf| lambda { |*args| f[yf[yf]][*args] } } ]
end
And it, too, works:
acc_fac = y { |rec| lambda { |n,acc| n < 2 ? acc : rec[n-1, n*acc] } }
tail_fac = lambda { |n| acc_fac[n, 1] }
tail_fac[5] # ==> 120
Cheers! —
TomLabels: lispy, ruby
But Y would I want to do a thing like this?
Choose life. Choose a job. Choose a starter home. Choose dental insurance, leisure wear and matching luggage. Choose your future. But Y would I want to do a thing like that?
Writing about
first-class functions and their
compatibility with object-oriented programming naturally leads to the
Y combinator. And that is the point where eyes glaze over and soft, snoring sounds rise from RSS readers everywhere.
But please bear with me, this essay is not really about the Y combinator, it’s about learning new things and expanding our capacity to think.
Sharpening the sawYears ago I picked up Steven Covey’s book
The 7 Habits of Highly Effective People
. If the book is a test out of seven, I really wasn’t doing very well.
If you’ve read the book, you probably remember that he talked about “Sharpening the Saw,” investing in your own abilities. That’s incredibly important, but I don’t need to tell you that. If you exercise with programming katas, or learn a new programming language once a year, or pick up a book like
The Reasoned Schemer
and actually go through the exercises, then you are already in the top 1% of software developers for personal skills improvement. (Sorry, certifications
don’t count. They are the classic case of doing the wrong thing for the wrong reason!)
New ideas—by which I mean, new to you—are an important way to sharpen your saw. If for no other reason than this: the brain needs regular exercise to perform at or near its potential. Learning new things keeps you sharp, even if you don’t directly use the things you learned.
Others have suggested that learning Lisp is beneficial to your programming skills in its own right. That’s one good way to sharpen your saw. But I add to that an important caveat: to obtain the deepest benefit from learning a new language, you must learn to think in the new language, not just learn to translate your favourite programming language syntax and idioms into it.
Think differentThe interesting thing about that is that almost by definition, if you see something in, say, Lisp that solves a problem you already have, you won’t learn much from the Lisp code. It is tempting to think that Lisp (or any other language) will somehow do what you’re already doing in some wonderfully magic way that is obviously better. But no, that isn’t how it really works.
For your problems are tuned to your existing tools. You simply can’t imagine problems that your tools can’t solve well, much less can’t solve at all. That’s why there are so few continuation-based web servers. Who’s going to invent one unless they have a programming paradigm with continuations?
And worse, when a new tool is applied to a problem you think you know well, you will probably dismiss the things the new tool does well. Look at how many people dismiss brevity of code. Note that all of the people ignore the statistics about the constant ratio between bugs and lines of code use verbose languages. Look at how many people dismiss continuation-based servers as a design approach. Note that all of them use programming languages bereft of control flow abstractions.
Thus, to truly learn a new tool, you must not just learn the new tool, you must
apply it to new kinds of problems. It’s no good learning to replace simple iteration with first-class functions: all you’ve learned is syntax. To really learn first-class functions, you must seek out problems that aren’t easily solved with iteration.
The Why of YWhich leads me back to fixed point combinators. They appear to have no practical (as in making money) use. And that’s why I’m suggesting to you that you figure out how to make one in your language of choice. The very fact that the problem is far outside of your realm of “practicality” guarantees that you will learn something. You won’t be simply applying your same-old, same-old techniques and patterns to a slightly new problem.
Start your research with Richard P. Gabriel’s
The Why of Y. Try porting his examples directly to your favourite programming language. If what you want to use is too brain-damaged to support closures, you may need to do a little greenspunning and build a little
functor infrastructure.
Don’t be dissuaded if you have to follow the functor route: you are learning far more about your language and about programming in general than the shmoes that settle for learning five new buzzwords related to the latest WS-* interoperability with XPath 3.x.
If you prefer a
fun approach to learning, you can do not better than Raymond Smullyan’s
To Mock a Mockingbird
: an enjoyable romp through the world of combinatory logic. After reading this book, you will have mastered the S, K, I, Y, and other combinators. Added bonuses include a safe that can only be opened by applying Gödel’s Incompleteness Theorem to its combination. How can you read this book and
not learn?
Eating my own dog foodI thought of a few things to say along these lines last week and then I abruptly realizing I was asking you to “Do as I say, not as I do.” What good is recycling problems I first encountered in University textbooks two decades ago? I put this post aside and set to work on a problem of my own.
I set out to write a function for making recursive functions—a function
extensionally equal to the Y combinator—in Ruby. The ultimate goal is to take something like:
lambda { |f| lambda { |n| n.zero? && 1 or f.call(n-1) } }
And be able to have it call itself recursively. In this case, to compute the
factorial function.
This is trivial, given that Ruby supports named recursion, but if you want to write a fixed-point combinator you want to write a function that makes recursive functions
without using the host language’s support for named recursive calls. In other words, you are bootstrapping named recursion out of anonymous first-class functions.
1There are important theoretical implications of being able to do this, but the killer reason to try it is to learn.
I started my quest for a function for making recursive functions with a rather trivial observation based on OO programming and the Curry function:
require 'test/unit'
class ExempliGratia < Test::Unit::TestCase
CURRY = lambda { |f, a| lambda { |*b| f.call(a, *b) } }
def test_recursive_curry
maker = lambda { |func_with_me|
CURRY.call(func_with_me, func_with_me)
}
assert_equal(120, maker.call(lambda { |me, n| n.zero? && 1 or n * me.call(me, n-1) }).call(5))
end
end
In OO some language implementations,
this (or
self) is a hidden parameter passed to each method. Thus, there’s a parameter—
me in the example code—that is added for handling recursion. If you write a recursive function—like the venerable
factorial—with the extra
me parameter, a trivial currying operation evaluates it recursively without any need for names.
This is obviously deficient. As noted above, we want to write
factorial like so:
lambda { |n| n.zero? && 1 or f.call(n-1) }
We’ll need an
f from somewhere, and just as our Scheme colleagues do, we’ll bind one as a parameter in an enclosing lambda. So we want to write:
lambda { |f| lambda { |n| n.zero? && 1 or f.call(n-1) } }
And somehow this should be transformed into a working
factorial function. For the test-driven crowd, we want to write:
def test_clean_up_loose_ends
maker = ...
factorial = maker.call(
lambda { |f| lambda { |n| n.zero? && 1 or n * f.call(n-1) } }
)
assert_equal(120, factorial.call(5))
iterative_factorial = maker.call(
lambda { |f| lambda { |n, acc| n.zero? && acc or f.call(n - 1, n * acc) } }
)
tail_factorial = lambda { |n| iterative_factorial.call(n, 1) }
assert_equal(120, tail_factorial.call(5))
end
Of course, we need some code for
maker. And the
iterative_factorial case shows that
maker works for functions with more than one parameter. The solution I came up with is:
CURRY = lambda { |f, a| lambda { |*b| f.call(a, *b) } }
maker = lambda { |f|
lambda { |func_with_me| CURRY.call(func_with_me, func_with_me) }.call(
CURRY.call(lambda { |inner_func, me, *args|
inner_func.call(CURRY.call(me, me)).call(*args) }, f)) }
The source code with each transformation from beginning to end is
here (I strongly suspect that this “curry combinator” is actually the Y combinator with a huge amount of cruft hanging off it).
Unique or derivative, crap or craft, the process of getting it to work has enriched my mind by forcing me outside of my usual problem space. I still can’t think of a practical application for what I’ve just written. But I know I’ve stretched myself.
And now back to you: perhaps you’re rushing off to try to implement a fixed-point combinator from first principles. Perhaps your plan is to code the canonical examples in your usual language. Those are both good paths. But whether you follow them today or not, remember the underlying principle exemplified by the fixed-point combinator:
Do not dismiss impractical or weird problems. While you may not have an immediate application for the code you write to solve such problems, you are maximizing your learning when you venture outside of your usual problem space.
- Named recursion is stuff like
foo = lambda { |...| foo.call(:bar) }. It takes advantage of the host language’s variable binding to recurse. If you want anonymous recursion, you should be able to assign the same lambda to another name and have it work just as well, as in: fufu = lambda { |...| foo.call(:bar) }. That won’t work if you are relying on Ruby’s name for foo.
p.s. Don’t miss
Tom Moertel’s derivation of the Y combinator in Ruby.
Labels: lispy, passion, popular, ruby
It's Monday, it's snowing, and I'm actually happy to be at work!
I found this link on
reddit.com:
“
I’m happy to grow up, but I won’t pretend fun things aren’t still fun out of fear of looking silly.”
Ok, I didn’t laugh hysterically at the comic. But I love the sentiment.
Raising my son in partnership with my wife is a lot of fun, and that involves a
serious amount of growing up.
Why should programming be any different? Business is
almost as serious as nurturing a life. It’s generally considered good form to enjoy parenting. Why shouldn’t we enjoy writing software the exact same way?
Sure, we don’t decide to rewrite an entire business application in
CPS on a whim because it seems like it might be fun to try. Just as we don’t teach our children to eat
Train Cake for every meal.
Let’s not pretend fun things aren’t still fun out of fear of looking silly. Programming is “Grown Up.”
And fun.
Labels: passion
HOF or OOP? Yes!
It’s interesting how the brain works: I wrote an essay that I thought was
an explanation of what first class functions and closures are, and why some people would like to see them added to the
Kitchen Sink of Languages.
Then I read
From Functional to Object-Oriented Programming, which suggests that my essay seems to be making a general statement about Functional Programming as an advance over Object Oriented Programming.
1Nothing could be further from my intention or my opinion. Smalltalk, for example, has a clean and powerful syntax for first-class functions. And those first-class functions are objects, as is
everything in Smalltalk. I am tempted to rewrite my essay using Smalltalk: it seems that using
Ruby, another language where everything is an object, is not making this point strongly enough. In my mind, it is easy to use languages that provide the OO paradigm as well as the first-class function paradigm. At the same time.
What I actually saidSo what I actually said is: “Here are these things called
first-class functions. This “thing” is where functions are exactly the same kind of thing as everything else in a language. This is useful, here is why.” If functions are first class, functions aren’t magically more special, nor are they second-class citizens in a language (like those brain-damaged primitives in crippled languages). If a language is an OO language, this implies that first-class functions in such a language are objects and that programming with them
is OO programming.
You’ll notice that I am not using the words “Functional Programming.” That’s because
I’m not talking about Functional Programming. I’m talking about closures and first-class functions. I imagine that these things are present in all contemporary Functional Programming Languages, but that has nothing to do with my essay about the value of having first-class functions in an OO language: Smalltalk has had them in an OO language
for nearly thirty years, and while it wasn’t the first OO language, it is the canonical definition of class-based OO.
That’s all I said. If you want to debate the merits of pure functional programming languages like
Haskell against OO languages like Ruby, you have to take that up with someone like
Tom or
Eric.
What I don’t understandI don’t understand the author’s objection to higher-order functional programming. He(?) says something about higher-order functions and type inference as being bad. I don’t get the argument. It seems like he is saying that the constructs are somehow too deeply nested, that it is too easy to make mistakes. I may not understand him correctly, for I have trouble seeing how this is different from having data structures with a thicket of HAS-A relationships.
Somehow we program just fine in business with complex data schema, and we manage to keep track of things just fine. Strong typing and compiler support helps. So does IDE support. So does drawing diagrams. How is “a function taking an integer and a function taking two integers, returning a function taking a list of integers and returning an integer” somehow more complex than “a record containing an integer and that has many records containing an integer”?
The argument seems to be (and I am open to correction, since what I think he’s saying is so obvious that I worry I’m misunderstanding it) that naming things makes all the difference, as in “a customer record with an ID has many Sales Records, each of which has an ID.” That does sound easier to understand.
But of course, nothing stops you from naming things if you have first-class functions in a type checked OO language, does it? I am not a practitioner, but I have been working my way through
The Little MLer
, and it is part of the ML programming paradigm to name first-class functional types for precisely the reasons the author seems to advocate.
Anonymity considered harmful?So perhaps the objection is to having too many anonymous things, to having too many values without names. There is certainly an appropriate balance to be sought. One language designer seems to dislike anonymous functions, to the point where his language won’t let you have any that won’t fit on a single line. He has an
opinionated language.
I have seen similar arguments with non-functions. That control flow branches should not be nested too deeply (you know,
case and
if and the evil ternary operator). I often use the “extract variable” refactoring to simplify expressions and make them more readable. Naming a part of an expression provides a certain level of abstraction that improves comprehension.
Types and expressions involving functions are no different, and if that’s what the author is saying I endorse his perspective.
That statement, however, is somewhat orthogonal to whether first-class functions should exist in a language. As long as you can name them, you have the tool to write comprehensible code. If your language has types, then you ought to be able to name types involving first-class functions. As long as you can name functional values and types, you can apply all the same style guidelines to first-class functions that you apply to record types.
In ConclusionFirst-class functions are a natural fit with OO, as evidenced by their presence in OO languages that aren’t glorified PDP-11 assemblers with some OO stuff bolted on the side by people with
very little OO and/or GUI programming experience. First-class functions can be used to write clean and legible code, using all of the same techniques we use for writing clean and legible code with records, objects, and other types.
From Functional to Object-Oriented Programming has some interesting points about FP and OOP and their historical context. This has nothing to do with my original post, but I think those words are worth reading and considering. The post raises some caveats about complex and anonymous first-class functions that are obviously sensible: we have noticed the same things with other forms of expression in programming.
- I will come clean here: when I think of OO, I am not thinking of C++ or Java, especially as they are practised in business. I’m thinking of languages like Self, Actor, and Smalltalk.
Labels: lispy