raganwald
(This is a snapshot of my old weblog. New posts and selected republished essays can be found at raganwald.com.)

Tuesday, August 22, 2006
  Anything "Ridiculously Easy" is going to attract some Ridicule


Recently, Lucas Carlson announced Starfish, an ultra-lightweight distributed processing framework, and map_reduce, his own implementation of distributed mapping and reducing based loosely on Google's famous MapReduce.

Almost immediately, people pointed out that what he had created looked to them like a toy. Some people pointed out that although its map is fully distributed, its reduce is centralized in the supervisor process. Others pointed out that fault tolerance was not built-in. Some even pointed out that it looked like a thin wrapper around other services (as if free software is sold by the pound).
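
To make the "distributed map, centralized reduce" criticism concrete, here is a minimal sketch of that shape in plain Ruby. It is not Starfish's or map_reduce's API, and it says nothing about how Lucas implemented either one. The method name `forked_map_reduce` and its `workers:`, `initial:`, and `reducer:` parameters are invented for illustration, and the sketch assumes a Unix-like system where `fork` is available. Each slice of the input is mapped in its own child process, and the parent (standing in for the supervisor) does all of the reducing.

```ruby
# Hypothetical sketch: none of these names come from Starfish or map_reduce.
# Map runs in child processes; reduce runs only in the parent ("supervisor").
def forked_map_reduce(items, workers:, initial:, reducer:, &mapper)
  return initial if items.empty?

  slice_size = (items.size / workers.to_f).ceil

  # Start one child per slice. Each child maps its slice and sends the
  # results back to the parent through a pipe.
  jobs = items.each_slice(slice_size).map do |slice|
    reader, writer = IO.pipe
    [reader, writer].each(&:binmode)
    pid = fork do
      reader.close
      writer.write(Marshal.dump(slice.map(&mapper)))
      writer.close
      exit!(0)
    end
    writer.close # the parent only reads from this pipe
    [pid, reader]
  end

  # The centralized step: the parent folds every partial result itself.
  jobs.reduce(initial) do |acc, (pid, reader)|
    partial = Marshal.load(reader.read)
    reader.close
    Process.wait(pid)
    partial.reduce(acc, &reducer)
  end
end

# Sum of the squares of 1..8, mapped in up to four child processes.
puts forked_map_reduce((1..8).to_a, workers: 4, initial: 0,
                       reducer: ->(sum, n) { sum + n }) { |n| n * n }
# prints 204
```

The parent is a bottleneck in exactly the way the critics describe, and nothing here recovers from a crashed child. Whether that matters depends on how large the partial results are and how many machines you actually have.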


I agree that map_reduce is not MapReduce. That's a good thing. After all, the world already has MapReduce. If you want to use MapReduce just the way it is, go work for Google. Don't wait.

You know, all the comparison to MapReduce has a strangely familiar ring. What do the following systems have in common?

Minicomputers, microcomputers, personal computers, laptop computers, 5 1/4" disc drives, 3 1/2" disc drives, 1.8" disc drives, client-server computing, PC databases, Unix, C, Ruby, Java, television, colour television, automatic transmissions, iPods, ...

Must I go on? History is replete with inventions that are simplified, scaled-down versions of things that have come before (I know, Java-haters will find it hard to remember a time when Java was the new kid on the block that represented a simplified, scaled-down language compared to C++). It seems that every time some such thing comes out, somebody points out that it is a toy, not suitable for serious use.

My personal favourite example of dismissing advancement is the television. When it came out, old time radio people dismissed it as a toy. Nice call. But wait, we aren't done. When colour television came out, the black and white television people dismissed it as unneeded. Somebody, as they say, wasn't dancing with the innovation they brought to the ball.

And what happens? Come on, you know where this is going, it's practically a cliché: first the new, simpler, less powerful thing lives in a weird niche where people have a special need that overrides the bountiful impracticalities of the new thing. Then whole new markets are discovered where the new thing offers the perfect balance of features and before long, the new thing takes over the old markets.

It's pretty obvious to me that when a lot of people dismiss something as being too simple, too underpowered, and lacking the wide variety of features and options of its predecessors, the right thing to do is to take a closer look and suspend final judgment. Right now there's a world market for maybe five full-text web search engines. If you are one of those five people trying to index the entire web, you can dismiss map_reduce immediately.

Everyone else might want to look at map_reduce (and everything else considered too wimpy for serious work) and, instead of listing all the ways it falls short of the status quo, ask in what ways the status quo falls short of mass-market appeal.

At first glance, map_reduce looks like it makes it really easy to distribute analysis, especially of things living in your database. Hmmm. Thousands of Rails users put things in one database. Will it scale to 2,000 systems? How many of you have 2,000 systems? Next question.

Now how many Rails programmers just ordered a shiny new Mac Pro with four cores? Nice to see a sea of hands. Guess what? You are all people who could benefit from map_reduce right now. Do you have a few spare Macs or PCs in your office? All the better, put them to work while you're at it.

I'm not in a position to recommend using map_reduce until I've tried it myself. But I can say without hesitation that there is a need for ridiculously easy distributed processing on Ruby, and it doesn't need to scale to 2,000 machines to be useful.

Update: Lucas posted a working example from a production system, "How I sent emails 10x faster than before." Updated again to link to his explanation of how reduce works in map_reduce.

Hot news!

"How many of you have 2,000 systems?" The answer is: all of you. Amazon's Elastic Compute Cloud lets you run applications on thousands of machines and pay only for compute time and bandwidth outside of the cloud. Note to my sharp-witted readers (again, all of you): this is not a license to write and say "because I might run an application on 2,000 servers, I'm dismissing Starfish without another thought." The correct thing to write is "I have written an application that runs on 2,000 servers, and..."


Comments on "Anything 'Ridiculously Easy' is going to attract some Ridicule":
It deserves the ridicule because it's got the map, but no reduce. Ruby's got dRb which is already dead simple and Rinda, built on dRb, which is similarly dead simple. All Lucas did was mislabel his rather minor achievement. Ruby may need "ridiculously easy" distributed processing, but unless you've been using Erlang for the past couple years, everyone else is in the same boat. And you only get half of the show with Starfish, anyway.
 
Chuckle.

I just realized that "anonymous" sucked me into his/her world, where the issue is whether map_reduce sucks or not.

I should re-read my own blog post before replying to any more comments.

The question this post raises is whether there is value to be harvested from map_reduce. I'm not trying to argue "map_reduce is great, you don't understand how great."

I'm arguing that even if map_reduce is ridiculous, it is still valuable. It may be valuable precisely because it only provides "half the show."

So anonymous, my reply to you should have been "yes, and that's why you should take another look."
 
I totally agree with Reg here, 100%.

I also have a strong feeling that Mr. Anonymous is someone in the Ruby community who hates that I am getting attention for Starfish because he thinks he can do 100x better but has never spent the time nor had the guts to actually write an open-source library... it can be scary to publish code and let it out in the open, ready to be ridiculed like he does here... in fact, he probably hasn't released a library because he is scared of people just like him.

For everyone else, don't worry: this ridicule won't stop me. I have a lot of interesting and easy-to-use features planned for Starfish.
 
"it's got the map, but no reduce"

When you wrote this, you may not have been aware that map_reduce does have reduce. To my knowledge, it was not included in the rdoc for Starfish 1.1.

Lucas has now posted an example, and I've updated the post with a link. Enjoy!
 
Good point.

The comparison that sprang to my mind is MySQL. It was once dismissed as a toy compared to "real" RDBMSs like Oracle. Now it runs a significant proportion of the world's websites.
 




Reg Braithwaite

