Can your type checking system do this?
I tend to work with dynamic languages. This is a personal preference at the moment. I have liked statically typed languages with powerful type checking systems in the past, and my mind is wide open to using them again in the future. Unfortunately, when you mention “static typing,” most people think you’re talking about Java. There is more to static typing than Java, a lot more. For example, Brian over at Enfranchised Mind
pointed out that languages like OCaml have a static type checking system powerful enough to enforce constraints like:
- All possible variants of a type must be dealt with—for example, that a list has no elements or at least one element must both be handled.
- The read_async function only takes file handles opened with the async flag set, while the close function can take file handles with the async flag set or not set.
- The buffer returned from read_async can not be accessed until the asynchronous read is completed.
- A Postgresql cursor can only be created and accessed with a transaction.
He’s absolutely right. Serious programming languages can do all of that.
Joel Spolsky once explained why he likes using Apps Hungarian naming conventions:
it helps avoid problems with various SQL and Javascript injection vulnerabilities. The idea is that you want to make it obvious when you’re violating certain constraints on the use of string values in your application. And Tom Moertel responded by showing
how Haskell’s static type checking can completely eliminate the problem. Once again, a powerful type checking system can be used to enforce an important constraint.
My own pet peeve problem is the dreaded
NullPointerException
. Why, I whinge, must the compiler demand that I type boilerplate like:
final Map<String, String> dictionary = new HashMap<String, String> () {{
// ...
}}
But it will not do a simple little thing for me in return like check all of my code paths to ensure that I never pass a parameter that
might be null to a method that assumes the parameter
can never be null? Wouldn’t it be
nice if the compiler would deign to help me with these things?
What do all of these things have in common? They are
semantic constraints. Constraints on the meaning of a program. The kind of type checking you find in popular programming languages that is based on interfaces, on the physical type of your values and variables, that kind of type checking is a step up from syntax, but not much further. I would call it grammar checking, checking that your “sentences” have the right parts of speech in the right places. There is no attempt to check that your sentences are meaningful.
the value of semantic validationPowerful type systems (type systems like those found in Ocaml and Haskell are often called “expressive” type systems), can be put to work enforcing a degree of semantic correctness. (There are other ways to do it: there are static code analysis tools, for example you can buy tools that check C/C++ code for memory leaks and buffer overflow vulnerabilities. And if you use a language with
Design by Contract features, you have another tool in your toolbox.)
The Little MLer introduces ML (and Ocaml) through a series of entertaining and straightforward exercises leading up to the construction of the Y Combinator.
ML and OCaml introduce powerful strong typing and type inference. Both are great languages to learn: you will stretch your understanding of defining types and writing correct programs.
There is an argument that testing can replace type checking. Quite honestly, I believe that extensive automated testing almost completely replaces the need for “grammar checking” in the software I write.
Why? Because checking that you don’t send the wrong message to an object, or call a function with the wrong number of arguments is something that seems to get found when my test suite reaches a certain tiiping point in code path coverage. So far I have always found these things very quickly.
That being said,
Theory P warns me that there are some I haven’t found, of course, but I have some faith that the result of such an error will not be catastrophic. With strongly typed languages like Lisp, this kind of failure is gracefully handled. We aren’t talking about a buffer overflow that can turn exploited into a privilege escalation.
The nice thing about grammar errors is that you don’t have to try very hard to catch them. If you have reasonable code coverage, you’ll find them. So you concentrate on your test cases, and while you are testing your logic you will validate your grammar.
So that’s why dynamic languages in conjunction with automated test suites are working
for me. YMMV, so let’s not argue about whether it would work for you or not, because I want to move ahead and say that I don’t believe that extensive automated testing has quite the same value for semantic checking.
I believe the hard part of building software to be the specification, design, and testing of this conceptual construct, not the labor of representing it and testing the fidelity of the representation. We still make syntax errors, to be sure; but they are fuzz compared with the conceptual errors in most systems.
Semantic errors, you see, are not covered “for free” while you exercise your program logic. If you want to rely on testing to find them, you must become very good at thinking up edge cases where things can go wrong. You must try situations where things fail and you get
nulls
. You must try to create race conditions in concurrent code. You must try really bad data, like setting up cases where an employee’s hire date is after their termination date, or there is a cycle in the graph of reports and managers, or someone has
eight bosses.
You have to try empty lists, lists with one item, all of the cases. Grammar checkers only check that you are passing a list, you need semantic checking to ensure you have coded for every case.
what to doNow, you do not get semantic checking for free if you start writing code in Ocaml or Haskell. Why? Because you will be a Real Programmer, someone who can write Fortran in any language.
A language that doesn’t affect the way you think about programming, is not worth knowing.
To take advantage of a language like Ocaml, you must actually think through the semantics of your program and think of ways to make the type system work for you. You must “think in types” instead of thinking in values. This is a valuable exercise, because it fundamentally changes the way you view programming an application.
In my experience, this thinking through the semantic constraints of your application may be the most valuable result of the exercise. You see this when people write a very database-centric application. They think about the entities and the relationships. They write constraints and triggers. There’s a great value in arguing all these cases out, in drawing the arrows on the white board and marking them with little question marks, plus signs, asterixes, and ones. It makes you think things through.
I was once asked in an interview to name a feature from C++ I wanted to use in Java. I didn’t name Templates (this was before generics), or Multiple Inheritance. I named the
const
keyword. I am a huge fan of
const correctness, of using the C++ compiler to enforce this one little
semantic constraint. It’s an enormous burden, because you literally have to go through your entire program, usage by usage, to make sure you are fully const correct and have correctly declared everything. It would be far easier if C++ had const inference.
But const correctness is valuable precisely because it forces you to think very hard about a
semantic constraint: when should something change, and when should it be considered immutable? The act of bringing a code base into const correctness forces you to think about what the code
does to its data, not just whether it has “spelling errors.”
I can see the same value in using a language like Ocaml or Haskell to enforce semantic constraints. Sure, the compiler will catch errors for you. But the act of designing your application’s types to enforce your application’s constraints will be a powerful force driving you to deeper understanding of your domain.
If your current programming language can’t do this, I encourage you to study ML, Ocaml, Haskell, or one of the other fine languages with powerful typing systems.
Update (April, 2008): 7 Actually Useful Things You Didn’t Know Static Typing Could Do: An Introduction for the Dynamic Language Enthusiast