Let's Make This Personal, Please

(This is a snapshot of my old weblog. New posts and selected republished essays can be found at raganwald.com.)

Tuesday, February 26, 2008

The whole Open Classes thing seems to be turning into another debate along the lines of static vs. dynamic typing or perhaps GOTO vs. Structured Programming. I’m all for a lusty debate about programming languages; it’s a definite interest of mine. But I’m going to put out an appeal right now: this time around can we try something different?

In the past, we (as in “we the blog echo chamber” or perhaps “we the programming.reddit.com flame wardens” or even “we the news.ycombinator.com mutual admiration society”) have fallen into the nasty habit of “negotiating for someone else.” Do you know what that is? It’s when you say, “Well, that’s fine with me, I can program Emacs to follow the Python indentation rules but Ralph over there hates being told how to format his source code.” Or perhaps you say, “Test-driven development is fantastic, but most programmers are too busy to write tests.” Not to mention when you say, “Parser combinators are fabulous, you can write a really good parser and eliminate all the accidental complexity, but when we outsource maintenance to India they won’t be able to understand my code.”

Those are all valid concerns for somebody, but how about we let those people speak up for themselves, hunh? And it swings the other way: we can argue the Optimistic View about other people, too. But whether we’re optimists or pessimists, instead of debating things based on our perspective of their experience, let’s debate things based on our own experiences and our own honest likes and dislikes.

As in—and this is my actual personal experience—“I can appreciate how dynamic typing—as exemplified by modifying open classes at run time—makes a program confusing. I feel the same way when I look at a Java program using extensive Dependency Injection driven by external XML configuration files, I have to pick my way through everything to figure out what is really going on at run time.” Note well the use of the word “I” and tying my observation to my actual experience.

That is a very, very different thing than arguing about what might happen to someone else under different circumstances than the ones we live within. Not that there aren’t important things happening elsewhere: You and I are just two people on a planet of billions. But quite honestly, we can go crazy hypothesizing what other people will think and do. I have my hands full just figuring out what’s going on with my code on my projects.

So let’s keep this personal, please. Let’s talk about our actual experience, please.

Thank you.

¶ 1:02 PM

Comments on “Let's Make This Personal, Please”:

This post has been removed by the author.

# posted by

ryan : 1:35 PM

It was fixed, but the fix hasn't propagated through RSS yet. Thanks!

# posted by

Reginald Braithwaite : 1:37 PM

This is a bit of a meta-note. I remember a workshop about estimations; we got a simple problem and were asked to say how long it would take us to solve. The next question was how much time we thought the person next to us would need. Ouch. Factor two, and in my case that estimate was closer to the time I actually needed. Ergo: Don't underestimate your fellows, and don't overestimate yourself, er, myself.

The problem was to calculate the number of seconds between the beginning of the workshop and the next first of April, and there was a distinct bifurcation in the estimates because some thought of the need to figure out whether there is a leap second to come on the intervening new year, and some didn't (including the speaker).
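A rough sketch of that calculation in Python, for illustration only. The workshop date below is made up (the comment doesn't give one) and the function name is hypothetical. Python's datetime arithmetic assumes every day has 86,400 seconds, so a leap second has to be supplied by hand, which is exactly the detail that split the estimates.

```python
from datetime import datetime, timezone


def seconds_until_next_april_first(start, leap_seconds=0):
    """Seconds from `start` (an aware UTC datetime) to the next 1 April, 00:00 UTC.

    datetime subtraction ignores leap seconds, so the caller must pass in
    the number of leap seconds that fall inside the interval.
    """
    target = datetime(start.year, 4, 1, tzinfo=timezone.utc)
    if start >= target:
        # Already on or past 1 April this year, so aim for next year's.
        target = datetime(start.year + 1, 4, 1, tzinfo=timezone.utc)
    return int((target - start).total_seconds()) + leap_seconds


# Hypothetical workshop start; not taken from the comment.
workshop_start = datetime(2008, 11, 3, 9, 0, tzinfo=timezone.utc)

naive = seconds_until_next_april_first(workshop_start)
# One leap second was inserted at the end of 31 December 2008 (UTC).
careful = seconds_until_next_april_first(workshop_start, leap_seconds=1)
print(naive, careful)
```

The two answers differ by a single second, but only the estimator who thought to ask about the leap second would budget time for looking it up.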

# posted by

Andreas Krey : 2:47 PM

In my very first experience with Ruby I had to substantially clean up the code of a junior programmer who had overused various metaprogramming features. I believe that Ruby was the first OOP language he had used extensively and I rather wished he had had a year or so in the Java mines to learn more conventional ways to solve problems. (it wasn't that the developer was stupid: quite the opposite -- he was very smart -- smart enough to figure out the quirky bits of Ruby...he just wasn't experienced enough at the time to know that there are simpler solutions)

My impression is that many programming language communities overpraise certain techniques: C programmers often get credit even when they prematurely optimize, Perl programmers when they summarize their code beyond readability and Ruby programmers when they metaprogram.

After all, how many blog posts make it to Reddit describing how to use inheritance instead of monkey patching or function calls with symbols as arguments instead of method_missing? Zed criticized the pickaxe book for teaching the basics instead of the cool stuff. Well maybe focusing on the basics is the better approach!

# posted by

Paul Prescod : 7:11 PM

Hmmm, how about this one:

I dislike dynamic typing because I never know, even when I run a program, whether certain simple properties are correct or not -- like I'm passing the correct type to a function. (Does DateDoSomething take a string or an integer...)

I don't know until runtime if I misspelled a method name.
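A small Python sketch of that complaint, using made-up class and method names. The misspelled call sits in a branch that is rarely taken, so nothing flags it until that branch actually runs.

```python
class Invoice:
    def __init__(self, total):
        self.total = total

    def total_with_tax(self, rate):
        return self.total * (1 + rate)


def report(invoice, include_tax):
    if include_tax:
        # Typo: should be total_with_tax. Nothing complains until this
        # branch actually executes.
        return invoice.total_with_taxes(0.5)
    return invoice.total


# The common path works, so the typo goes unnoticed.
print(report(Invoice(100), include_tax=False))

# The rare path is where the mistake finally surfaces.
try:
    report(Invoice(100), include_tax=True)
except AttributeError as error:
    print("found only at run time:", error)
```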

IDEs can't provide me useful information about the source in my dynamic language, because they can't analyze it and therefore can't understand it.

I can't perform simple tasks like semantic renaming of variables in IDEs for dynamic languages, for the same reason. Yes, it is a problem of the language if IDEs can't be made to reason about it. IDEs are important productivity enhancements.

I have to frequently look up the documentation for functions in dynamic languages because there is not even a compiler to validate whether I'm passing the right types, let alone IntelliSense-type features.

I don't find myself limited by static type systems. I don't mind typing out a 3-line interface. I do mind that Java makes me put it in its own file; but sensible languages like C# and Scala don't.

I enjoy and prefer explicitly creating all interfaces and class hierarchies. I consider it a simplified design of the system. It's easy to change them as the system evolves. It's preferable to have to change them, because the compiler catches mistakes.

I love... having an interface with three methods in it, and I'm working in code which anonymously creates an object of this type. I compile my code and go to debug it, and the compiler tells me I forgot to declare one of them. I love not using a dynamic language and not finding out about this silly mistake 3 weeks later when that 3rd method (which only exists to handle an odd corner case) finally gets called and the program crashes.
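The scenario can be sketched in Python with hypothetical names. Python has no compile step, so this is only an analogue: the duck-typed object fails when the corner case is finally hit, while declaring the interface with the standard `abc` module moves the failure forward to object creation. That is still a run-time error, not the compile-time error described above.

```python
from abc import ABC, abstractmethod


class Handler(ABC):
    """A three-method interface; the third method covers a rare case."""

    @abstractmethod
    def start(self): ...

    @abstractmethod
    def finish(self): ...

    @abstractmethod
    def on_odd_corner_case(self): ...


class DuckHandler:
    """No declared interface; the third method was forgotten."""

    def start(self):
        return "started"

    def finish(self):
        return "finished"


class DeclaredHandler(Handler):
    """Declares the interface but forgets the same method."""

    def start(self):
        return "started"

    def finish(self):
        return "finished"


def run(handler, odd_case=False):
    handler.start()
    if odd_case:
        handler.on_odd_corner_case()
    return handler.finish()


# Works for weeks, until the odd corner case shows up.
print(run(DuckHandler()))
try:
    run(DuckHandler(), odd_case=True)
except AttributeError as error:
    print("late failure:", error)

# With the declared interface, the omission is reported at creation time.
try:
    DeclaredHandler()
except TypeError as error:
    print("early failure:", error)
```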

I find that when I'm using a language with properly strong type inference, like OCaml or Haskell or Scala or others, I wonder why anyone would ever want a dynamic language. I don't often do the kind of work where I feel statically typed languages are "inflexible".

My preferences are informed. I know the major programming paradigms and their benefits. I write functional code in Java and lament its lack of recursive data structures. I know a wide variety of programming languages, including Lisp, and rarely find myself missing esoteric features like macros. It's just easier to write a library proper -- it's more accessible to other programmers.

While Lisp macros and similar features might be very powerful, there is a powerful benefit to expressing programming logic in a common set of semantic primitives that everyone, including other people who might have to read your code, understands. Macros and monkeypatching and other dynamic features introduce new semantics which add vastly to the difficulty of understanding a program. While they may be powerful and elegant for some features, I don't believe it's worth the cost to introduce new fundamental semantics for most things. E.g., how does this macro interact with scope? Does it capture? Is it hygienic? When you're working with ordinary functions, you know exactly the kinds of ways they could possibly behave. With a macro, you have to look at its definition. Macros are only appropriate in libraries that are very common -- and in this case they're great.

Dynamic language features lower encapsulation of code. To expand on the above, because macros and such features can introduce new fundamental semantics, you can't just 'call' them naively. You must look at the implementation to understand their behavior. Type systems isolate code from other code -- you know what kind of interactions can occur because they're described by the types.

I usually want the "right" way to do something, not the "fast" way. I am engineering for very large-scale systems that need to be highly reliable. I don't work for a startup.

I view super-dynamic features like monkeypatching as inherently against the goal of writing reliable software. Whatever productivity these features give you could be obtained by other means at a much lower cost in later debugging time.

My view is that the goal of writing reliable software is to automatically, mechanically validate as many (interesting) properties of my program as possible. (I realize that type systems don't catch all mistakes, but that's not very important to the discussion, because that class of mistakes is still a problem in dynamic languages.)

Mechanical validation is awesome. I love unit tests. I love mock objects. Those help you catch significant mistakes sooner. (Dynamic languages have them too, of course)

I love my type system because I love refactoring. I love that when I change some interface one place in the program, everywhere else in the program that uses the interface inconsistently breaks! It won't compile! That's awesome. That's exactly the purpose of the compiler. I love that when I rename variables and classes, the IDE performs this transformation for me. Many other refactoring commands are available.

Of course, again, there is a class of mistakes, such as changing an interface's semantics but leaving it with the same type, that a type system can't help you with. That doesn't matter in a comparison -- the type system helps you with the class of problems that can be caught.

To conclude my comment: I don't believe static type systems help you catch many significant mistakes (at least not Java or C++ style type systems). They only help you catch tiny silly mistakes. This is a very infrequent case when working solely on your own code, and a very frequent case when maintaining and extending others' code, like at a large company.

In the future, I expect static type systems to catch significant mistakes. I expect them to enforce concurrency properties and even algebraically prove that my integer will never go out of bounds. (This is only an untenable problem if you try to write such theorem-solvers for any program. We're dealing with a specific class of programs.)

I can see why some people might like dynamic languages, if they only ever have to work with code they wrote. They remember everything about their own code; a static type system wouldn't help them much.

Working on code written by someone else in a dynamic language is much harder than working on static code. It's harder to understand, and significantly harder to change. Unless the whole program is thoroughly tested, how do you know the object you create conforms to certain expectations?

You might respond, well, how do you know your class properly implements (semantically) the interface in Java? The key is that types often act as documentation. Something is better than nothing. An interface which lists some method names and their arguments, in a statically typed program, might have been written as a dynamic program with no documentation at all, leaving the reader with no idea what the type represents.

To put it another way: If you do the bare minimum in a static language, you're doing a helluva lot more than the bare minimum in a dynamic language. It forces you to think and declare specifically certain things. This is good and bad, depending on the situation. For large-scale, high-reliability software, you want that structure. It's only helpful. To other people, if not you.

Dynamic type systems eventually provide no utility as a codebase grows as large as (say) Windows or Linux. Monkeypatching in such a system would be absolute hell -- not a reasonable solution.

However -- and this is the most important point of all -- although there don't exist good languages with both static and dynamic typing, there is no reason why a language can't include concepts of both. You could implement dynamic typing in Java by simply adding a language feature for method calls that does dispatch on the object at runtime using reflection. The language Boo already does this: http://boo.codehaus.org/Duck+Typing

I expect "programming languages of the future" to incorporate aspects of both. Type systems will really be more like theorem provers; types express common behavior you want to prove. You can enlist the theorem prover any time you want in your code, or can skip using it altogether.

This is my experience. The biggest problem of software engineering is how to provide reliable software at a reasonable cost; the problem of creating workable software at almost zero cost is a much smaller problem, of startups. This problem is best solved with statically typed languages.

This is just my opinion. I have, in my career, worked on a certain sort of software. Not just boring big company software, but software that needs reliability. Compilers, IDEs, low-level system components. I recognize that other people have differing opinions and different goals.

Overall, it comes down to this: the complexity of dynamically typed programs grows exponentially with the lines of code, while the complexity of statically typed programs is linear, because all program semantics are described precisely by the types, and isolated by the type system. In dynamically typed programs, you see arbitrary nonlinear relationships between arbitrary parts of the program (monkeypatching, possibly macros).

This is my view. Please don't flame me for it too much ;-)

Cheers!

P.S. -- I'm working on an implementation of my 'ideal' language, with both static and dynamic types. It looks like Lisp or Logix [http://www.livelogix.net/logix/], with optional types like Haskell. I don't have much of an implementation yet.

# posted by

Justin Crites : 4:39 AM

To make a long story short, my view is basically this:

Minimizing 'side effects' is very important in programming. I would say that side effects are always bad when it comes to understanding a program in a sufficiently clear language and type system, but usually good when it comes to writing code.

That is: Side effects make it easy to write code, and hard to read it.

They are OK when the scope of side effects is very small -- like changes to variables of the class instance whose method we are in. They are not OK when their scope is an entire program.

Type systems serve to limit, whether strictly or semi-strictly, the side effects a program may have. (Note that exception conditions are considered side effects in strict pure type systems). Haskell is so strict that literally no side effects can occur in the type system, without you (the programmer) knowing about it (ignoring unsafePerformIO, which is bad style). Java and Scheme are more lax, permitting mutation of variables.

With a statically-typed program, side effects are constrained into individual modules; it's effectively encapsulation. Any class might change its own state, but it probably only makes method calls to other classes. Thus, it's easy for the programmer to correctly conceptually model the program.

In a dynamically typed program, you might have a macro call that does strange shit -- anything is possible. You might have a function that takes in your object, and replaces all of its methods with no-ops, while leaving their names defined.
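That last scenario is easy to reproduce. Here is a minimal Python sketch with hypothetical names: a function that takes an object and swaps every public method for a no-op, leaving the names in place.

```python
class Account:
    def __init__(self, balance):
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount
        return self.balance


def neuter(obj):
    """Replace every non-dunder callable attribute of obj with a no-op."""
    for name in dir(obj):
        if name.startswith("__"):
            continue
        if callable(getattr(obj, name)):
            # The instance attribute shadows the method defined on the class.
            setattr(obj, name, lambda *args, **kwargs: None)
    return obj


account = neuter(Account(100))

# The method name is still defined, but the call silently does nothing.
print(hasattr(account, "deposit"), account.deposit(50), account.balance)
```

Nothing at the call site of `neuter` hints that the object's behavior has changed; the caller has to read the function's body to find out.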

In a dynamic program, there are absolutely no limits on how some code you're calling can change your environment. In Java, the only way it can influence you, the caller, is if it (1) changes objects you pass it or if it (2) changes objects you both already knew about.

This means that debugging problems in complex systems you didn't write is much easier in statically typed languages.

It also means that you can write code more easily, and much faster, in dynamically typed languages. I suspect this observation will resonate with programmers here on Reddit.

You can pound out code like crazy in PHP, Ruby, Perl, Python, whatever. It ends up being pretty hard to maintain if you come back to it later, especially if you use monkeypatching and similar techniques. Exactly what did my Lisp macro do again?

Java is moderate both to write and to read. Easy enough to write that people use it, and its type system is moderately useful at verifying the program.

Haskell and Coq, on the final hand, the extremes of the type system world, sometimes take forever to write. You really have to think to make the code work sometimes, to figure out how to say what you want to say. But once it's done? It damn well works the first time.

# posted by

Justin Crites : 5:04 AM
