If Sneetches with Stars use Java, and Sneetches without Stars use Ruby, who uses ML?
ML is a programming language featuring type inference: you don’t have to encumber your code with type declarations; the compiler can figure them out for you. So… are type inference languages like ML for Sneetches with or without stars? Or another kind of Sneetch entirely?
Now, the Star-Belly Sneetches had bellies with stars.
The Plain-Belly Sneetches had none upon thars.
Those stars weren’t so big. They were really so small
You might think such a thing wouldn’t matter at all.
Update: More than a few people have written that Steve Yegge's association of static typing with neatness and dynamic typing with slovenliness runs opposite to their impressions of the kinds of people who strongly prefer one or the other. I used Steve's terms in the original post, partly because I thought people would get the same joke I thought Steve was making. It looks like they don't; nobody wrote to say "LOL." I have changed the terms to something that represents what I think of the cultural divide between programmers who like Java and programmers who like Ruby.

Let’s review. Sneetches with stars like to use a colour-coded label maker to label the drawers, boxes, and files in their office. One glance at anything and you know what it holds. Sneetches with stars add extra labels even when you don’t need them. For example, if a box is labeled ‘tax receipts’, each piece of paper inside has a post-it note saying ‘tax receipt’, even if it’s obviously a tax receipt and lives inside the tax receipts box.
What is Stariness?
Sneetches with stars like the languages we say are statically typed. What do we mean by the word static? We mean something that can be resolved at compile time. Other words for this idea are invariant or constant. Sneetches with stars like languages where the type of each entity can be resolved at compile time.
Some people are always critical of vague statements. I tend rather to be critical of precise statements; they are the only ones which can correctly be labelled "wrong."
Raymond Smullyan
Let’s dive into this a little deeper. (My apologies to my readers who were actually paying attention to the stuff in first year computer science that isn’t a requirement for getting a job at BigCo.) What does it mean when we say “something can be resolved at compile time”? That expression is laden with implementation details like assuming we’re using a compiler. But it’s a convenient short-hand for saying something about the program that is true every time you run the program.
Consider the final declaration in Java. If you write:
final String snafu = "situation normal...";
We know that the variable snafu always holds a reference to the constant string "situation normal...". No matter what data you feed to your program and how you mangle it, snafu will always be "situation normal...". Do you agree? (Joe Campbell, put your hand down. Yes, there is a back door way you can change the contents of a String in Java.)
Java can take advantage of this to perform constant propagation. Everywhere you write snafu, Java can substitute "situation normal..." and throw away the variable lookup. To get away from arguing about back doors in the String class, let’s consider one of the primitive types, a boolean. If you write:
final boolean foo = true;
// code without assignments to foo
if (foo) {
    // do something
}
else {
    // do something else
}
Wouldn’t you agree that the compiler can get rid of the variable lookup and the if statement? The path through the code is always the // do something path, every time you run the program.
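To make the point concrete, here is a rough sketch of both optimizations at once; the println calls are stand-ins for whatever your code actually does and are not part of the original example:

final String snafu = "situation normal...";
final boolean foo = true;

// Constant propagation: the compiler may treat this call as
// System.out.println("situation normal...") and discard the variable lookup.
System.out.println(snafu);

// Dead-branch elimination: with foo known to be true on every run,
// only the first branch ever needs to execute.
if (foo) {
    System.out.println("do something");
} else {
    System.out.println("do something else");
}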
Now back to the word stariness. We really mean the amount of stuff about the program that can be resolved at compile time, or if you prefer, the amount of stuff that is true every time you run the program. In the example above, the compiler can figure out which branch the program will follow at compile time, because that variable is true every time you run the program.
Stuff that is always true is useful. For most programs, we have an idea in our head about “correctness.” What we mean when we talk about a program being correct is that it produces desirable results every time you run the program.
A formalist is one who cannot understand a theory unless it is meaningless.
Stariness is thus similar to correctness. And that’s why a lot of people, the Sneetches with stars, are obsessed with it. Being able to “prove” something about their program (“the method call foo.bar(5) never throws a MethodNotImplemented exception”) feels a lot like being able to prove that their program is correct.
It feels a lot like it, but it isn’t the same thing. The reason it isn’t the same thing is that while it’s true that a program throwing MethodNotImplemented exceptions is probably not correct, it’s not true that a program that doesn’t throw such exceptions is correct. It just feels, somehow, more likely to be correct because we’ve thrown out one of the infinite ways it can be incorrect.
Now that we’ve dispatched that logically, let’s be clear about something: just because stariness does not enforce correctness, it doesn’t mean that stariness isn’t useful. Stariness is useful. Period, no debate.
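Here is a tiny illustration of the sort of thing a starry compiler proves for free; the frobnicate call is made up, which is exactly the point:

String greeting = "Hello";
int length = greeting.length();  // the compiler proves length() exists and returns an int
// greeting.frobnicate();        // un-comment this line and the program no longer compiles:
//                               // "cannot find symbol" -- the missing-method failure is
//                               // ruled out before the program ever runs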
Back to inferences
Type inference is also for Sneetches with stars. A language with type inference resolves the type of each entity at compile time by inspecting the program and working the types out from how each entity is used. It’s a lot like the way a compiler can look at the Java code above and figure out that you always // do something and you never // do something else. The code looks sorta like you could go either way, but the compiler knows better.
Languages with type inference look like variables can have any type, but the compiler knows better. Remember the labels that the Sneetches with stars love so much? Type inference languages still have labels, but the labels are hidden inside the files and boxes where you can’t see them. Remember when manufacturers used to put their labels inside clothes instead of right across the front? Same thing. The rules for what goes where are strictly enforced; it’s just that if you can figure out what goes where with a bit of common sense, you don’t need a label or a post-it note.
Compare these two snippets of Java:
final String[] words = { "foo", "bar", "blitz" };
final int word_length = words.length;
final String[] anagrams = new String[word_length];
…and…
final words = { "foo", "bar", "blitz" };
final word_length = words.length;
final anagrams = new String[word_length];
Hey, if a variable is final, we can figure out its type in Java through simple inspection. Making that work in the compiler is something an intern ought to be able to do over a summer work term!
(Frank Atanassow pointed out that techniques exist for inferring the types of nearly all Java variables through inspection of programs. But this simple case is enough for our purposes.)

So if we took a valid Java program and simply erased type declarations whenever we could logically deduce the type of the variables (using our simple scheme), but left them in whenever we were not sure, we would have exactly the same program. Nothing about it has changed except that it has fewer symbols. It’s just as starry, it is just as static, it is no more or less correct than it was before we erased some symbols.
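As it happens, Java has since grown exactly this kind of local inference (the var keyword, added in Java 10), so the second snippet is not far off real code today. Roughly:

final var words = new String[] { "foo", "bar", "blitz" };   // inferred as String[]
final var word_length = words.length;                       // inferred as int
final var anagrams = new String[word_length];               // inferred as String[]

Note that var insists on new String[] { ... } rather than the bare initializer, but every type is still pinned down at compile time; the label has simply moved inside where you can’t see it.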
And you over there itching to say something about IDE refactorings and auto-completions: None of those go away either. You can rename things and move things and press command-tab to get an object’s methods whenever you like. So… would you agree that type inference of this sort doesn’t change a starry program into a starless program? This isn’t about stariness versus starlessness, it’s about the obsessive-compulsive desire to label everything.
The bottom line: type inference does not change a statically typed language into a dynamically typed language. It’s still starry.
So why can’t the Sneetches without stars use type inference?
Think of types as being like values and objects as being like variables. A statically typed language is one where there are no type re-assignments. Some languages enforce this; in others, if you write a program in a static way, you can still reason about it as though it were enforced. This is why lots of people think that we can “neaten up” languages like Ruby by adding type inference to the compiler: they're thinking about programs that are starry to begin with, but that we happen to have written in a language for Sneetches without stars.
And whenever someone talks about a refactoring IDE or an auto-completing IDE for a dynamic language, they’re talking about performing some type inference on Ruby programs that are written in a static way. So… what’s the holdup? We said we could add type inference to Java in a summer. Where’s the intern to add it to Ruby?
Programmed. In me somewhere, he thought, there is a matrix fitted in place, a grid screen that cuts me off from certain thoughts, certain actions. And forces me into others. I am not free. I never was, but now I know that; that makes it different.
Philip K. Dick, "The Electric Ant"
The problem is that the set of all programs that are "starry" is a proper subset of the set of all programs that parse correctly. So either not all starless programs are starry, or not all portions of a starless program are starry, or both.
Let’s compare back to our Java snippet. Remember:
final boolean foo = true;
// code without assignments to foo
if (foo) {
    // do something
}
else {
    // do something else
}
The compiler could infer that we always follow the first branch because it knows that final variables are not reassigned. They’re immutable. What happens if we erase the final keyword as well:
boolean foo = true;
// code that might have assignments to foo
if (foo) {
    // do something
}
else {
    // do something else
}
Now the job is much harder. We have to examine all the code in between the declaration and the use of foo. If there are any assignments involving things we can't know until runtime, we can't know the value of foo until runtime.
For a very large class of programs, we cannot infer the contents of a variable with less runtime complexity than running the program for every possible input. This is why compilers have limitations on the optimizations they can perform, and humans still need to do some thinking about writing fast programs.
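For instance, one small change to the earlier snippet puts it in that hard class. Assume this sits inside main(String[] args); the Boolean.parseBoolean call is just a stand-in for any run-time input:

boolean foo = true;
if (args.length > 0) {
    // foo now depends on input the compiler cannot see,
    // so neither branch below can be thrown away.
    foo = Boolean.parseBoolean(args[0]);
}
if (foo) {
    // do something
} else {
    // do something else
}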
This exact same thing happens with types. In statically typed languages, types are never re-assigned. Whether explicitly declared or inferred, they're immutable. But in languages like Ruby where methods can be added and removed dynamically, where messages can be forwarded dynamically, where we can even send messages dynamically, the types of objects are fully mutable.
In starless languages, there is no final keyword on the types of objects. We can no longer infer the type of a variable in any but the simplest, degenerate cases.
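Java’s nearest analogue is reflection, and it shows the problem in miniature: once the message to send is a run-time string, nothing about the call can be resolved at compile time. A made-up sketch, assuming it sits in a main(String[] args) throws Exception with java.lang.reflect.Method imported:

// The "message" to send arrives as data, not as source code.
String methodName = args.length > 0 ? args[0] : "toUpperCase";

Object receiver = "plain-belly";
Method method = receiver.getClass().getMethod(methodName);
Object result = method.invoke(receiver);

// The compiler can say nothing about result's type, and the lookup itself
// may fail at run time with a NoSuchMethodException.
System.out.println(result);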
The type inference problem in dynamically typed languages is exactly the same as the problem of inferring the possible contents of a variable. Inferring the contents of a variable is doable for a restricted set of programs. And the way we tell the compiler that a variable is a member of this restricted set is with the final keyword.
Likewise, the way we tell a compiler that the type of a variable is also restricted is that we use a language where the type of every variable is final. It’s the same thing: we don’t reassign final variables and we don’t change types on the fly.
Starlessness is not about writing programs without labels. Starlessness is when you write dynamic programs. Dynamic doesn’t mean ‘unlabeled’. As I showed above, if the final keyword is there, the label is mostly optional. But if you don’t have final, you’re writing dynamic programs.
Truly starless programs have dynamic types: types that change at run time. They are not always one thing or another. For example, suppose you write an Object-Relational Mapper (“ORM”) that reflects on the database structure at run time: change a column in a database table and you get new getter and setter methods in your program.
Without recompiling.
In a fully static language (with or without type inference), you can’t do that. Think of Java’s JDBC: you have to fool around with methods that get values and pass a column name as a parameter. Or maybe you create a hash. And C# is getting this capability, but if you look closely you still have to define the “type” of a query through the LINQ syntax.
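For the record, the JDBC fooling-around looks roughly like this; the table and column names are invented, and conn stands for an already-open java.sql.Connection in a method that is allowed to throw SQLException:

try (Statement stmt = conn.createStatement();
     ResultSet rows = stmt.executeQuery("SELECT name, stars FROM sneetches")) {
    while (rows.next()) {
        // Columns are addressed by run-time strings; the compiler cannot
        // check that "stars" exists or that it really holds an int.
        String name = rows.getString("name");
        int stars = rows.getInt("stars");
        System.out.println(name + ": " + stars);
    }
}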
Are Sneetches with stars ever starless?
A dynamically typed language lets us define an object holding a database row with methods for each column. But we can’t know at compile time whether our program will throw a MethodNotImplemented exception, because we don’t know whether someone will monkey with the database structure. That sounds bad.
But what happens if you write the same thing in a starry program? Aha! A SQLException! It seems that there are dynamic things that must be dynamic no matter what you do.
This is a specific case of Greenspunning. There are some facilities of dynamic languages that you are going to need. If you don’t have them built into your static language, you will build them yourself or use a framework that has already built them for you. Other examples I have seen in Java programs include:
Spring and Hibernate;
Any use of Class.forName(...);
Any use of dynamic proxies.
In essence, you’re being a Sneetch without a star but twisting your starry language to permit starlessness. And for those portions of the program that are no longer nice, starry bundles that can be examined at compile time for invariant behaviour, you are indeed in dynamic territory and have to live with the risks.
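A dynamic proxy is the clearest example of the twisting: the object’s behaviour is decided by a handler at run time, and the compiler can only check the interface you promise to honour. A minimal sketch, with a made-up Greeter interface declared at the top level:

import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

interface Greeter {
    String greet(String name);
}

// The handler decides at run time what every method call means;
// no class anywhere implements Greeter statically.
InvocationHandler handler = (proxy, method, methodArgs) -> {
    if (method.getName().equals("greet")) {
        return "Hello, " + methodArgs[0] + "!";
    }
    throw new UnsupportedOperationException(method.getName());
};

Greeter greeter = (Greeter) Proxy.newProxyInstance(
        Greeter.class.getClassLoader(),
        new Class<?>[] { Greeter.class },
        handler);

System.out.println(greeter.greet("Sneetch"));   // prints "Hello, Sneetch!"

Spring and Hibernate lean on exactly this machinery, which is why they lead the list above.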
In my experience, all non-trivial starry programs contain this kind of starlessness. To my admittedly inexperienced eyes, starlessness is the hallmark of expert programming in starry languages ("expert" does not necessarily mean "more desirable," especially in the minds of those who believe that programs should be written and maintained by mediocre developers).
Eating cake
So… can we say that since you can write starless programs in starry languages, you can have the useful benefits of stariness when you need it and the flexibility of starlessness when you need that too? Isn’t that better?
Yes, you can say that. And you may be right, for you. The Boo people believe that: their language has a duck keyword for when you feel like a Sneetch without a star. Be aware that at this moment in history, languages designed for Sneetches without stars seem to have much better features for writing starless programs than languages for Sneetches with stars. So my observation is this:
If you dislike the verbosity of starry languages like Java but like the feeling of safety, try a type inference language. Don’t go to a starless language if you don’t intend to actually write dynamically typed programs.

My experience is that if you are frustrated by the amount of work you have to do to express the algorithms in your head, you should look at a language that removes your frustration. If you're using Java and don't like the verbosity, find a language that celebrates brevity while preserving static typing. But if you're using Java and find yourself pushing more and more logic out of Java because its type system is too static or too inflexible, you should consider a language with a different approach to typing.
Computer languages differ not so much in what they make possible, but in what they make easy.
Larry Wall
Why would the Sneetches without stars use starless languages?
Writing starless programs on top of starry languages is exactly the same thing as writing automatic memory management routines on top of a manually managed programming language, or writing functional programs on top of a noun-centric object-oriented language.

You can take that statement as an argument in favour of specialized languages for Sneetches without stars or as an argument against them. My guess is that the above statement is true and a Rorschach inkblot: you will interpret it as confirmation of your existing prejudices.