raganwald
(This is a snapshot of my old weblog. New posts and selected republished essays can be found at raganwald.com.)

Friday, December 07, 2007
  Amortized Costs, DOS, and Magicicadas


Jeff Atwood has just posted an interesting article, Hashtables, Pigeonholes, and Birthdays. Hash Tables are a very interesting subject, you can take almost any aspect of them and follow it down interesting trails.

For example:

Performance: Quantifying the performance of a data structure like a Hash Table is interesting. There are measures like Amortized Cost as well as Worst Case Cost. This subject is one of the core themes of the book Purely Functional Data Structures. (You can also download the author’s thesis on the same subject) I recommend it whether you have an interest in “purely functional” programming or not: “purely functional” data structures provide all of the imperative and side-effect oriented operations you know and love—like insertions and removals—while also providing the foundations for highly concurrent environments like threads with shared memory.

Security: One-way functions are strongly related to hash functions. They are the foundation for many of today’s security and secrecy protocols. For example, fingerprinting is used to allow mirror sites to host downloads while giving the public comfort that the downloads have not been compromised with Malware. However, if the one-way functions underlying them can be cracked—as MD5 was recently—miscreants can substitute bad files for good at will.

Even when security does not appear to be a factor, compromising a hash function can have a security impact. As one person mentioned in one of Jeff’s comments, if the hash function for a programming language is public, attackers can deliberately engineer worst-case performance to create a Denial Of Service attack. Here’s a deliberately trivialized example:

Consider a public-facing site with free user registration. When you log in, the site performs a look up on the user name. Now obviously there’s a database somewhere. But perhaps the programmers put a cache in front of it. You could imagine such a cache being useful if cookies are used to keep someone logged in. If the cache uses a hash table, a vandal can create lots and lots of user names that hash into the same bucket. Anyone with a user name in the same bucket will face a long wait as the system performs a serial look up within the bucket.

Whimsey: One of the comments mentioned that the number of buckets is usually a prime number. Did you know that insects figured this strategy out on their own?



Editorial/Love Fest:

And now for what I really wanted to say.

People have pointed out that Jeff often seems to be summarizing well-known computer science principles. A Reader’s Digest for programmers, as it were. The suggestion seems to be that this is not useful, since anybody with a degree or even a passing interest in programming ought to know these things. My own feeling is that Jeff’s blog is far more useful than my own.

Coding Horror is published out in the real world. It is written for real people.

This weblog operates firmly in Lake Wobegon. It’s true! The very fact that you are reading about programming means that you are exceptional for reading it and I am exceptional for writing it. I am not going to start a flame war and say you and I are better programmers. Who really knows? But I am very confident when I say that we are not representative of the typical programmer.

The typical programmer does not know what a closure is and does not care. If they are added to Java he or she will not use them any ways. The typical programmer may have obtained their degree, but they were far more interested in memorizing whatever was needed to get a B on the examination then they were in the subject matter.

They may have a certification, but they do not own and have not read books outside of their immediate requirements, so Knuth is foreign to them. While they may talk of refactoring, they have never heard of Martin Fowler or Joshua Kerievsky.

When Jeff writes a Digg-able post about multiple monitors, or how to optimize memory on a Windows PC, he get readers that would never subscribe to a blog if they were told it was about programming. And I am talking about programmers. Try to wrap your head around the idea of programmers who do not find programming blogs interesting (in fact, they don’t find programming interesting at the moment).

If they are then exposed to something insightful and interesting about hash tables… Many will read it and forget it, but a few will be interested enough to delve deeper into the whys behind the whats.

I am all for that. It is remarkably easy to turn and preach to the choir. Evangelizing to the bored and uninterested… This is a frustrating and difficult job. I wish I had the zeal to reach out across the divide and bring people over to where programming is deeply fulfilling and fascinating. I’m glad Jeff is doing the job he’s doing and wish his blog every success.
 

Comments on “Amortized Costs, DOS, and Magicicadas:
I was thinking along similar lines when I read his post. I don't think there was anything new for me in that particular article (other than the numbers for the birthday effect.. I knew of it but couldn't remember any percentages) but even if something isn't new it's good to be reminded.

While the post itself didn't teach me anything new, discussion of it surely will. Already I've learned something from its discussion. I'd never considered using hash collisions for a DDoS and because that was mentioned I'm going to be more careful when using hashes anywhere a random user can be faced with it.

Oh and Jeff is coming to Montreal for CUSEC in January. Looking at the speaker list it seems CUSEC is going to be amazing once again.
 
Reg, just out of curiosity, where do you get your notions about what the "typical" programmer is like?

I keep seeing these statements from bloggers to the effect of: "The very fact that you're reading a programming blog means that you're exceptional."

Can you make statements like this because you've personally met what you believe is a representational cross section of programmers, and you know that 90% of them did not read blogs related to programming?

How would you back up your claims about what a "typical" or "average" programmer is like?
 
Matt:

What do you think?
 
Not trying to turn an anecdote into a statistic, but Coding Horror was the first programming blog I got hooked on (about two years ago), after several years as a programmer-sans-clue. Because of how approachable and interesting it usually is, I now read a handful (including this one, obviously), and often delve deeper into topics unfamiliar to me. Actually, I just looked up closures and lambdas yesterday - it bugs me not to have any idea what everyone is talking about. Coding Horror is a great gateway drug for programmers aspiring to broaden their horizons.
 
I find coding horror to be very interesting and useful. I'm a young programmer just out of school (I graduated about 6 months ago). Usually I read it and yes, I know these things - but they reside in the back of my mind. It's easy to say you know something when you read it again, but do you remember that same information when you're writing code and the blog post isn't in front of you? That's why I find coding horror helpful - it repeats the knowledge I should know, so that I internalize it and remember it when I could really use it. I think everyone could use a reminder from time to time, it certainly doesn't hurt.
 
I think a simple thought experiment should show that what CodingHorror does is not easy and valuable.

If it were so easy and not of value, every technical blogger interested in having a large audience would have as many subscribers as he does.

He's obviously doing something others are not. :)

As for the notion of the "typical programmer", I've observed the same thing you have. I have tons of anecdotal evidence from working in the field.

What we need are software sociologists to start generating stats on this stuff. A real study would be very illuminating.

So I believe and agree with this notion based on my experience, but I don't have hard stats.

But at the same time, nobody has hard stats to say it is false.
 
Totally true, I'm a designer hooked on coding horror, I don't even program but Jeff has the elusive talent to make subjects as "dry" as hash tables seem interesting anybody.
 
I know this isn't conclusive or even scientific, but my own experiences verify the "typical"/"non-typical" programmer divide. I'm a consultant and whenever I go to work at a new customer site I ask the people I work with (both colleagues and customer staff) two questions;

1. Which IT, programming or management books have you read in the last twelve months?
2. Which, if any, technology related blogs to you regularly read?

Adding up the number of people who have answered yes to either I get the same 20/80 split that is mentioned here and at coding horror.
 
I will just say this, what's more impressive than the fact that the cicadas emerge in 'prime number' yearly cycles is the fact that they figured out and numbered themselves in Roman numerals!
 
CodingHorror, Spolsky, Ragenwald, Broooks, even McConnel's books. There all a manifestation of something unusual, that really ought to bring out the sociologists:

Programmers write more about programming than writers write about writing. And more than mathemeticians write about doing math.

Programmers, coders, developers, computer scientists, and hackers write more about their own professional mental and behavioral work processes than any other profession, probably including psychology and psychiatry.

What is it about us or our algorithms or data structures that makes us wax logorrhoeac?

There is good news, according to our oracle, Wikipedia: "Several of the possible causes of logorrhoea respond well to medication".

Is our wordiness caused by the unusual semantic requirements of our work? Does the essayist community naturally drift into coding? Or is it merely an increased access to keyboards?

Whatever your answer, please be as brief as possible, but no briefer!
 
I agree!

One of CodingHorror's strengths is that it's so easily approachable.

It deals with many everyday things, but offers a lot of insight on them.
 
All in all, it takes at least 10 years for anyone to master their field, and this includes Mozart, Einstein, Bobby Fisher, etc.

Reaching back to the roots of your training is the best to solidify neural pathways and build out new relationships to concepts. What CodingHorror is the same as what chess master and TaiChi master Josh Waitzkin does to acquire new talent and abilities. My post has a links to an article from Scientific American that discusses the learning processes of chess grandmasters. It is very uncanny how that process is similar to what has been discussed here. Enjoy.

The Art of Learning
 
Reg,

To be honest, I've only been out of college for about 2.5 years. So I really don't feel like I could speak intelligently about what the average programmer is like.

Even if I had more experience in the field, isn't there likely some sort of sampling bias present here?

Wouldn't "exceptional" programmers, like those who read blogs, naturally tend to find their way to companies that have a higher than normal density of exceptional programmers? How could you really find "clean" data in that case?

One other thing of interest: I just took my first job as a consultant a few months ago. As one of the other commenters here noted, this might afford me some interesting opportunities to see what programmers are like out there in many different real world environments, and might make for some nice data-gathering opportunities.
 
"What is it about us or our algorithms or data structures that makes us wax logorrhoeac?"

Programming is at the same time complex and done by very many people, and we don't even really understand what we are doing.
Trying to understand produce words.
 
>> Try to wrap your head around the idea of programmers who do not find programming blogs interesting (in fact, they don’t find programming interesting at the moment).

The idea that people might be involved in this insane, frustrating, and misunderstood craft *without actually being interested in it* fills me with a deep depression.

I can imagine that people go into 'the professions' (say law or medicine) for the money, without any actual vocation. But surely no one would do that with programming? Please?

(Actually, I know this is true - I have encountered people in the past for whom this is 'just a job'. I'm just trying to convince myself that in fact I imagined them...)
 
Larry:
I wish that you just imagined them. The sad truth is that there are many people who do get into our field purely for money. The sad thing is that they're allowed to. There are not enough sufficiently hard courses that weed out the less dedicated ones available on the university/college level. Same could be said of the HR people who are not able to distinguish wheat from chaff, unless the wheat is a class of its own.
 
re: Srdjan

I don't see why that's necessarily a sad thing. You don't see plumbers spending their entire weekend and evenings reading up on PVC tubing.

The fact is that the majority of programming tasks aren't actually that difficult. They don't require a superstar in order to get done and, in a lot of cases, a superstar wouldn't do them anyway. A professional programmer who takes his work seriously but considers it a job and nothing more is exactly what you need most cases.
 
Guillaume,

Yes, but I assume that by saying "professional programmer", you assume that person to be a competent one as well. The problem is that most people in our industry who got here for money and not love often tend to find the shortest way to a "working" solution. Not necessarily the best way, not even something that's acceptable by someone who loves the field of programming.

I'm pretty sure you already know of it but, in case you don't, go to The Daily WTF. Sometimes, there are wonderful stories of shortcuts taken and the mess that results.

I guess, my point is that people like that need to be systematically weeded out, whether through education or through better interviewing skills. Steve Yegge has written about it extensively.
 
Yes you're right. I should have defined professional before using it.

Professional to me means a competent worker who can define proper goals, has the means to achieve them and puts the achievement of those goals above inconsequential nonsense like language religion wars and such.

People who look for the shortest solution (usually a copy & paste solution) to something that sort of seems like it works I don't call professionals, I call them hacks.

And Hacker != Hack.

To me, a hacker is someone who loves to understand how things work and loves to create. Definitely not a hack.

I really should have clarified all this from the start, sorry.
 




<< Home
Reg Braithwaite


Recent Writing
Homoiconic Technical Writing / raganwald.posterous.com

Books
What I‘ve Learned From Failure / Kestrels, Quirky Birds, and Hopeless Egocentricity

Share
rewrite_rails / andand / unfold.rb / string_to_proc.rb / dsl_and_let.rb / comprehension.rb / lazy_lists.rb

Beauty
IS-STRICTLY-EQUIVALENT-TO-A / Spaghetti-Western Coding / Golf is a good program spoiled / Programming conventions as signals / Not all functions should be object methods

The Not So Big Software Design / Writing programs for people to read / Why Why Functional Programming Matters Matters / But Y would I want to do a thing like this?

Work
The single most important thing you must do to improve your programming career / The Naïve Approach to Hiring People / No Disrespect / Take control of your interview / Three tips for getting a job through a recruiter / My favourite interview question

Management
Exception Handling in Software Development / What if powerful languages and idioms only work for small teams? / Bricks / Which theory fits the evidence? / Still failing, still learning / What I’ve learned from failure

Notation
The unary ampersand in Ruby / (1..100).inject(&:+) / The challenge of teaching yourself a programming language / The significance of the meta-circular interpreter / Block-Structured Javascript / Haskell, Ruby and Infinity / Closures and Higher-Order Functions

Opinion
Why Apple is more expensive than Amazon / Why we are the biggest obstacles to our own growth / Is software the documentation of business process mistakes? / We have lost control of the apparatus / What I’ve Learned From Sales I, II, III

Whimsey
The Narcissism of Small Code Differences / Billy Martin’s Technique for Managing his Manager / Three stories about The Tao / Programming Language Stories / Why You Need a Degree to Work For BigCo

History
06/04 / 07/04 / 08/04 / 09/04 / 10/04 / 11/04 / 12/04 / 01/05 / 02/05 / 03/05 / 04/05 / 06/05 / 07/05 / 08/05 / 09/05 / 10/05 / 11/05 / 01/06 / 02/06 / 03/06 / 04/06 / 05/06 / 06/06 / 07/06 / 08/06 / 09/06 / 10/06 / 11/06 / 12/06 / 01/07 / 02/07 / 03/07 / 04/07 / 05/07 / 06/07 / 07/07 / 08/07 / 09/07 / 10/07 / 11/07 / 12/07 / 01/08 / 02/08 / 03/08 / 04/08 / 05/08 / 06/08 / 07/08 /