Whatever happened to code reviews?

(This is a snapshot of my old weblog. New posts and selected republished essays can be found at raganwald.com.)

Friday, January 11, 2008

I have read an argument that a programming language can enforce good programming style by restricting programmers to constructions and mechanisms that are deemed worthy, or perhaps are restricted to a small set so that there is an enforced consistency in a code base. (I am not talking about using compilers to detect bugs or memory management being used to prevent bugs, I am talking about programming languages that do not permit certain kinds of abstractions outright).

The premise of the argument is that if programmers are constrained to a single, universal “good style,” good programs will result.

responsibility and authority cannot be decoupled from each other

In many other forms of management, people have attempted to automate the manager out of a job. There is an idealistic notion that if we write just the right series of instructions for people, if we build the perfect instruction manual, if we detail every possible case and script every possible behaviour, we can build a team that always does the right thing. We see that in highly industrialized occupations like preparing fast food: there is a detailed, step-by-step instruction for everything, and employees simply follow the rules.

Many things have been written about whether that works or doesn’t work for software developers. I am now talking about why that doesn’t work for teams, for team leaders, and for development managers. Every thing you put in place in an attempt to enforce good behaviour takes something away from the team’s responsibility to manage the process using judgement.

a false sense of security

We joke about a compiler generating a false sense of security: it compiles, so it must work. But that is exactly the mind set of choosing a language on the basis of it actively preventing poor style: we are choosing it because it will relieve us of the work of manually reviewing code and analysing whether it is of good style or not. It will relieve us of the work of setting a good example and evangelizing good code. It will relieve us of the work of teaching, of leading, or managing.

In the fast food example, we know that the “managers” in the restaurants are not true managers. They have little or no authority to deviate from the rules. And for that reason, they have little or no accountability for what results: you can fire them for failing to follow the rules, but if they follow the rules and bad things happen, it is the rules that are at fault, not the manager.

And so it is with enforcing “good style” with a programming language. If someone writes code in that language, they are absolved from all responsibility for its style. And so is their manager: the programmer’s code compiled, it is demonstrably free of bad style, what can we demand of their manager other than to see to it that they use the language?

Software development is a young profession. We don’t know an awful lot about what works and what doesn’t work. but I’ll take a stand here: We humans have an awful lot of experience with the relationship between authority and responsibility. And what we have learned is that if we remove an individual’s authority to choose, we absolve them of responsibility for the result, and ultimately performance suffers greatly even if their skills are objectively high.

automatons on the style council

Does this mean that automation is bad? Why do we use high level languages and compilers? Why do we use libraries and frameworks? Don’t those tools also relieve us of responsibility?

Code Complete is one of the greatest books ever written on the subject of shipping software. There is a reason it is subtitled “A Practical Handbook of Software Construction:” Every page is loaded with insights and practices that can help you be a better developer, help your team be a better team, help your manager be a better manager, and help your organization ship software. The wisdom about code style, quality, and review will pay for the book many times over.

Well yes, of course they do. They relive us of responsibility for things we have deemed not to matter. For the vast majority of lines of code, the exact translation to CPU op codes is unimportant, as long as it works. Nobody cares to look under the hood. Compilation is the process of automating the accidental complexity of our CPU.

If you choose to use a programming language with lazy evaluation, you have decided that evaluation order is an accidental complexity that no longer matters to you, and you want the language to sweep it out of sight. What we choose to automate speaks to what we consider irrelevant. As long as it works, we don’t care how it’s done.

So when we automate style, we are deciding that style dosn’t matter. As long as it passes the compiler, it is good style.¹

Automating good style ultimately weakens the notion of individual and shared code ownership, it emphasizes the notion that if it passes the gate, it must be good. That, in turn, creates the illusion that no further inspection is required. If it functions as we require, and it is written in the language that enforces good style, why do we need to review it?

We can talk about why role of a programming language should be to enable good style, and have a terrific debate about what that means and which languages do a good job with it. But it is more important to keep our eye on the ball. The most important factor for the success of software projects is still the people. And what makes people produce good software? Taking responsibility for their work. Ownership and accountability.

Ownership and accountability is weakened when you attempt to constrain the programmer and it is enhanced when you provide a programmer with freedom and follow up with review. Thus, my question:

Whatever happened to code reviews?

Python’s significant whitespace is a good example of this. By enforcing a standard indentation, Python says that indentation isn’t worth thinking about: everybody does it one way, and that’s a benefit for readability. Compare and contrast this to a hypothetical language Imperitivity. Imperitivity does not have anonymous functions or closures as first-class objects, and many Imperitivity programmers argue that they should not be added to the language because that would be bad style, or it would promote inconsistent style with some people using closures and some not.

Would we seriously equate the two positions? I think not: whitespace really is pretty trivial: the fact that we can write a script or IDE plug-in to “fix” indentation to match a style guide is evidence that it should not take up valuable team and management attention.

However, the question of whether a code base should include closures, should not include closures, or ought to enforce a consistent approach to closures is important. Using a language to settle the question is an abrogation of responsibility: the matter should be decide by the team or technical leadership and enforced through review, mentoring, training, example code, and so forth.

The salient difference is that while consistent indentation is useful, indentation itself is unimportant in comparison to other technical choices the team must make. Important choices, such as the appropriate paradigms or idioms to use, ought to be decided by the team and enforced through review.

¶ 1:38 PM

Comments on “Whatever happened to code reviews?”:

"But Reg, I have a team of a hundred geographically-distributed developers shuffled in and out of several different projects at a time. It's too difficult to vet their skill sets and knowledge before we hire them, and we have just too much code to review. Can't James Gosling tell us how to write maintainable code?"

# posted by

chromatic : 2:03 PM

I review 100% of the code which goes into our products, and generally others see it as well. We use StarTeam to enforce review of all code changes.

And yes, we catch a huge amount of bugs this way which are often the sort of bug that's difficult to catch in testing. Stuff like missing resource protection blocks, cut and paste errors, etc.

# posted by

Craig : 3:03 PM

Good post

IMO a code review is not only about catching bugs, it's about making sure that the code is easy to read, that it means something, that it reflects the ubiquitous language and that it will be (hopefully) possible to maintain, extend, unit test, etc

Code reviews are sometimes not being done because as long as it works (can be tested in other ways) a lot of people don't care what the code looks like. It's also too hard, people don't know how to start and then you have the clashing of egos.

And when they are done the reviews often just touch the "surface".

# posted by

Manuel Palacio : 3:57 PM

There are other issues. Languages can make strong guarantees if they ban (or at least express serious displeasure with) certain constructs. For example, type safety in a language like Java is guaranteed by the lack of pointer-level concepts and the presence of a GC. Those things being present eliminate a certain amount of programmer responsibility, but it gives something more valuable back.

There's a tradeoff to be found. If the language doesn't have enough policy, you can easily end up with very different code in different sub-projects or libraries brought in. You can see some of this in C++; code using reference-style, almost akin to Java classes, versus the value-oriented object style favoured by the standard library. Consider also the tradeoff between polymorphic containers and generic containers, and hybrids: the language's policies make different strategies, which have value in their own right, more workable or less. It's easy in C# to make polymorphic containers work because everything is System.Object derived and things autobox. The same effect in C++, across code from multiple libraries, isn't nearly so easy.

# posted by

Barry Kelly : 4:10 PM

Barry:

Languages can make strong guarantees if they ban (or at least express serious displeasure with) certain constructs.

True. But as I said: “am not talking about using compilers to detect bugs or memory management being used to prevent bugs.”

type safety in a language like Java is guaranteed by the lack of pointer-level concepts

I am not convinced of this point, due to the existence of type casts. Other languages replace casts with different mechanisms taht guarantee soundness. But FWIW, Java's exception-throwing casts are better than C's "damn the torpedoes" blind casts.

If the language doesn't have enough policy, you can easily end up with very different code in different sub-projects or libraries brought in.

Ahem. What is a code review for, if not to establish the appropriate code for each project and to choose the appropriate libraries to bring in?

# posted by

Reginald Braithwaite : 4:30 PM

We have a distributed team so we decided to use Review Board. So far it working very well.

http://www.review-board.org/

# posted by

jasonW : 6:24 PM

Isn't the theory that pair programming is an immediate kind of code review?

Although I must say I like the Mozilla approach -- any fix to a bug must be submitted in the form of a patch, approved by a reviewer, and then signed off by a super reviewer. It makes for some very tight oversight.

# posted by

Will Sargent : 11:40 PM

Python says that indentation isn’t worth thinking about: everybody does it one way, and that’s a benefit for readability.

Except for the huge arguments in the python community over tabs versus spaces.

# posted by

Sean : 1:33 PM

If Sean's implicit point is that indentation *is* worth thinking about, then I agree with him.

There's not a *huge* debate about tabs vs. spaces in most Python communities that I've been in, and the tabs-vs.spaces debate transcends Python anyway. But again, I agree hopefully with Sean and definitely with the blog that indentation is a relatively minor issue compared to more fundamental code decisions.

I would argue that one of the reasons that Python enforces a certain style of indentation is to let Python progammers get past stylistic considerations and on to real issues. If a language is rigid in a dimension that doesn't really matter, but flexible in other dimensions that do matter, then I view that as a good thing.

I think it's unfortunate that Python didn't go far *enough* in enforcing a particular indentation style. Guido should have insisted on spaces only.

# posted by

Steve Howell : 8:53 PM

Reg,

It's hard to argue against code review, but I do think it's a valid purpose of a language to facilitate certain good ways of expressing programming solutions, and also to prevent certain bad habits.

# posted by

Steve Howell : 8:58 PM

Will,

I think it's more than theory that pair programming enables code review. I've seen it in practice.

For a project like Mozilla, I think it's completely valid to gate the check-in process by requiring a reviewer and a super-reviewer. But I also think that in a small, cohesive team, that's overkill.

My opinion is that pair programming is the most frictionless way to enable code review, and I think code review is important. But if you don't have pair programming, you do need to institute some kind of process that not only makes people accountable for their checkins, bug that also facilitates some kind of discussion about changes to the code.

# posted by

Steve Howell : 9:18 PM

it's a valid purpose of a language to … prevent certain bad habits.

Getting me to agree with t depends on what you mean by “certain bad habits.”

Some people believe that using “confusing” programming techniques like parser combinators are bad habits on certain projects. That may be true for those people on that project, but I have a hard time with the idea that a language should make writing parser combinators hard for this reason.

That is the sort of thing that really needs to be decided and enforced by people working with people.

# posted by

Reginald Braithwaite : 11:39 PM

Reg,

My favorite languages are dynamic languages, so I'm definitely biased toward languages allowing you to express more, not less, and buyer beware if you abuse certain features.

But I also think it's a language's role to minimize the number of arbitrary stylistic decisions a programmer has to make. For example, in Ruby, you can omit parentheses in method calls. It's a nice feature, and I do use it, but it's a feature that's hard to use completely consistently, so it leads to inconsistent looking code, especially in a team environment where different people have different aesthetics.

I've spent enough time in the world of Python to believe that "There should be one-- and preferably only one --obvious way to do it." I think there's some wisdom in that philosophy. Python has a certain discipline to it, yet I never felt constrained by it.

# posted by

Steve Howell : 4:51 PM

Reg,

You will notice that I responded to your question, but I dodged the main substance of your question, which is about "bad habits." I think it's a good question, and I couldn't come up with a great answer, but that's okay, you helped me clarify my thoughts.

I enjoy your blog, BTW, please keep posting interesting thoughts.

# posted by

Steve Howell : 5:00 PM

in Ruby, you can omit parentheses in method calls. It's a nice feature, and I do use it, but it's a feature that's hard to use completely consistently, so it leads to inconsistent looking code, especially in a team environment where different people have different aesthetics.

This statement raises several issues for me. First, and most obviously, if you have a team environment where different people are doing it different ways, where is your code review?

This exact thing happened to me when I joined Mobile Commons. In my first code review, someone asked me to parenthesize all method calls for consistency with the rest of the code base, so I did.

If different people are doing it different ways, I infer that one of two cases holds: your team isn't reviewing code, or your team doesn't think it is important enough to establish a standard.

As for Python enforcing it, I'm actually all for a language that only does method call syntax one way. This is not a reversal of what I just said, it is a different point: method call syntax simply isn't important, and by choosing one way to do it, Python is saying that the decision is not important.

On the other hand, I have real issues with a language only having one way to do important things, like "The only way to abstract functions is through unwieldy anonymous inner classes that cannot reference mutable local variables in scope."

That's a much more important kind of design decision. I'm ok with a team deciding not to abstract functions in their code, but not ok with abrogating that design decision to the language.

# posted by

Reginald Braithwaite : 6:12 PM

Reg,

In the case of my team deciding if/how we want to parenthesize method calls in Ruby, my team does plenty of code review (mostly via pair programming), but we don't think it's an important enough issue to set a standard (and so we have both styles in our code).

A little bit of stylistic variation in the code is obviously not the end of the world. In fact, even though our team has no de juro standard for how to parenthesivze methods, our team does tend to put them in some places consistently and omit them in other places consistently, so that parentheses can send a signal to developers in some cases that Python wouldn't allow. (Conversely, Python has the benefit of being less expressive here stylistically, so you never worry about unintentionally sending the wrong signal).

# posted by

Steve Howell : 6:37 PM

I assume this example comes from Java: "The only way to abstract functions is through unwieldy anonymous inner classes that cannot reference mutable local variables in scope."

Python actually has a very pure way to abstract functions, in that you can pass functions around willy nilly, and then you don't need any special syntax to call them. I'm not too hung up on syntax, but clean syntax does create good affordances for programmers to treat functions as something easily passable, and then better code results.

Do you have other examples of languages constraining programmers?

# posted by

Steve Howell : 7:01 PM

<< Home