(This is a snapshot of my old weblog. New posts and selected republished essays can be found at raganwald.com.)

Sunday, November 18, 2007
  The best question asked this week: What, exactly, are you trying to prove?

To prove the correctness of code, you need to know exactly what correct behavior is for the code, i.e. a complete and unambiguous specification for what the code should do. So tell me dear reader, when was the last time you received an unambiguous fully detailed specification of an application?

If I ever received such a thing, I would simply execute that sucker, because the only unambiguous complete spec for what an application does is code. Even then, you have to ask, how do you prove that the specification does what the customers want?
—Phil Haack, What Exactly Are You Trying To Prove?

Proving that the software meets the rigorous specification written by the customer before they had seen the software work, before they had tried it, has been of secondary importance to me in my career. The overarching concern has been the transformation of the requirements from what the customer guessed they might need, based on superficial desires and buzzwords, into what the customer discovered was actually important, over the lifetime of the project.

No, no, no! You don’t know what you want!
—Luigi, Cars

Hey, I am not blaming customers. They don’t know what they want because nobody really knows what the optimum finished software looks like before development begins. Not the Customer, not the Developers, and not even the “Architects,” “Product Managers,” “Business Analysts” or anybody else who think they are a requirements expert.

Software development is a process of discovery, for the customer, for management, and for development. As information is acquired—through inquiry, inspection, or experience gained from misadventure—our understanding of what we are trying to accomplish is refined. In that respect, every tool, be it a programming language, a development process, or a testing practice, must be judged for its contribution to the transformation of both the requirements and the software over time.

I am not saying that strong typing or provability, iterative development or BDD, or anything else is necessarily better or worse in this respect. I am just pointing out that the surface, buzzwords for these things—fewer bugs, provability, test coverage, separation of concerns—are not the goal, they are ways and means of achieving the goal, which is the discovery of the true requirements and the delivery of working software which meets the true requirements over the project’s complete span.

Comments on “The best question asked this week: What, exactly, are you trying to prove?:
Most software I've worked on has user-visible parts that change rapidly while customers change what they want, but also fully internal parts that change very slowly. Examples would be a messaging layer, or a persistence layer, or maybe some core algorithm. While it's pretty much impossible to rigorously specify in full how the user-visible parts should work, it is much more possible to do so for the internal facilities.

For example, proving that Google Reader is "correct" is probably meaningless, and even if it wasn't, the definition of correct changes so rapidly that it would be an exercise in futility. But, it may be possible to prove that Google Reader's Atom parser correctly parses feeds in accordance with the standard.

That being said, I think trying to "prove" code is correct is still only a fun game for academics. In the real world we have to settle for heuristics, like unit tests or maybe code coverage.
As you say, proving that a parser satisfies a spec sounds useful. It's not the most convincing use case I can think of, though. Once you write a correct parser, you're basically done until the spec changes, and the parser might be not much more fundamentally complicated than the spec is. Not so for optimizing compilers (aka the Compiler Writers Full Employment Theorem). If in practice you are going to be tempted to make a growing optimizing compiler or a growing number of variants or both, having an automatic technique to relate their correctness to a fixed-size spec could be rather useful.

Unfortunately there are also other serious limitations to proving programs in practice. E.g., it can be difficult to construct proofs of things as complicated as practical programs, and to put proofs in such a pedantically exact form that a machine can check them. Even pushing simple mathematical theorems through a prover like Coq can be fairly tricky.

The circuit design folk face the same fundamental problems: "but there might be a bug in the spec" and "a SMOP can be even larger and more complicated when P means Proof than when P means program." In practice, though, for some useful fraction of their designs they can just beat the proof problem to death with techniques like binary decision diagrams. I don't know of any important subfield of software engineering where similarly useful progress has been made, so today in software proof does seem to be only a fun game for academics. But BDDs are a fairly simple trick, and as I understand it, their effectiveness in practice came as an great surprise to most people. So just because proof remains an academic game for a given subfield in year Y doesn't give me complete confidence it will remain an academic game in year Y+4.
There is validation and there is verification or as commonly known to most software engineer as the V&V method.
Verification is about making sure you might the product right. This involves unit test, integration tests, etc
Validation is about making sure you make the right product. This involves demonstration, white box testing, testing to specification.
If you don't know what exactly you are trying to prove, chances are you have no idea what you are building.
All your process of discovery should be completed before you start development otherwise you don't understand the problem statement.
Yes there are iterative approach but this only happens once you are using a build plan, that is delivering product phase by phase. At each delivery, you consult your customer, change your requirement, and your validation test plan.
In addition, you need to actually agree with your customer on some common terms and remove ambiguity. Things like "fast", "correct" are extremely rubbish and subjective. Sit down and learn actually software quality metrics that the industry use to measure quality.
In the real world, if you actually apply half of what you learn from university and colleges, there would be less projects that run late or cancelled
Good point. I think technology isn't that important: 90% of failed projects were already doomed by the time the spec landed at the developers.

I think the main reason is that software design is like soccer: everybody thinks they know how to do it right. Just like when the guy who is paid $30K a year to repair cars claims to know how that kick should have been caught better than the goalie is paid $3M a year to do just that, users, managers, all sorts of laypeople start their change requests like "Could we add a button that..."

This is the hardest part of the job to convince people they should not design software, because they cannot, just stick to defining the expected results.
For many, doing the software development isn't the hard part.

Understanding the needs (especially the usually scant "requirements" provided) for the initial development is a big challenge.

Even after fully tested, user-validated, and signed off by the users, an ongoing challenge is identifying or interpreting additional system changes requested and properly what it will take to implement the changes without causing problems immediately or down the line.

Success isn't the software - it is in the product and how successful it is lies in the hands of the users. Incorrectly understood (and thefore incorrectly applied) requirements will and usually do cause all kinds of problems, sometimes way into the future.

I've never been much for all the buzzwords, nor have any manager I have worked for/under either. I guess that has been a good thing :)
Have you read any literature on program verification? Dijkstra's "A Discipline of Programming" introduces the topic well.

This particular criticism is neither novel nor insightful.
Nothing in my remarks here should be construed as critical of program verification.

Thank you for making it obvious that my writing does not seem to communicate my intent as clearly as it could.
In the same (quoted) paragraph this guy is talking about two distinct things.

To prove the correctness of code, you need to know exactly what correct behavior is for the code, i.e. a complete and unambiguous specification for what the code should do.

Discusses the correctness of code. "Code", as it is used here, can mean anything from a measly little if statement to the source of a full-blown desktop environment.

With monumental amounts of time, effort, and unambiguous specification you probably could prove that a desktop environment functions correctly according to the specs. That said, with much more reasonable amounts of time, effort, and specification you can prove things about portions of a desktop environment. Like, for instance, that the window manager can never lose focus.

The following sentence:

So tell me dear reader, when was the last time you received an unambiguous fully detailed specification of an application?

Is talking about an application. Of course when we get to something as large and ill-defined as an application correctness proofs won't be much help. Most people that use them will readily agree with this.

The value of proving that code is correct is in plugging all of the leaks in the abstraction. I can write code that works with the xmonad window manager with the assurance that xmonad will never ever ever cause focus issues.

Proving code to be correct makes it possible to interact with it under the assumption that it just works. If statements just work, while loops just work, and proven correct window managers just work. Forget about the details, get on with attempting to decipher the rest of the application's requirements, these bits of code will never cause you a problem.
As anonymous#6, I apologize if I've misunderstood your post.

The question itself is good. I found the original author's tone derisive and dismissive.

Unfortunately, the long-running debate history about program verification has produced a number of throwaway criticisms. "Euler's proofs were flawed" is one. "What are you going to prove?" is another.

If you strip away the historical baggage, the question itself is interesting. I would say that any introduction to formal verification should address it.

The real answer is that you should begin by selecting a a useful property, breaking it down into several smaller ones and formulating a predicates that describes these. It may well be that within a program, any invocation of function foo() must not occur between calls to function bar() and baz().

Formal verification must apply from the bottom-up (as verifications need to rest on a base of more elemental verifications). The application then benefits not from improvements to its totality (I would not try to formulate the predicate "This program produces pleasing poetry") but from control of and confidence in its skeleton (that it is free from deadlock, that it has no more than N files open simultaneously).
As anonymous#6, I apologize if I've misunderstood your post.

You are not a number. You are a free (wo)man!

Seriously, no need to apologise, although since you did, thank you for the civility.

I don't have much of an expectation from comments, other than they are obviously more freewheeling and less formal than my writing things on my blog or people responding on their own blogs.

So there is, IMO, a little license to say what you mean as directly as you please.

It has provoked me into thinking about my feelings... I do have a quarter-baked idea for another blog post (judging by my output, it must be only one quarter-baked: if it was half-baked, I would post it).

My thought is that using the phrase "proving program properties" on an actual project is precise and defensible, but not helpful without asking "Which properties? And why?"

And using the word "correctness" is nearly useless. Note that I am talking abut the words, not the ideas.

I wonder if what really matters to me personally from verifiability is establishing consistency: that two or more different things in an application are consistent with each other.

In the case of specs and code, instead of saying "we prove the code correctly implements the spec," we say "the code and spec are consistent with each other."

And in the case of strong and expressive types, we say "the code is consistent with itself."

Because the inverse, showing that the code is inconsistent, such as sending a message to an reference that might be null at run time, is always a bad thing.

There's my quarter-baked blog post. I'm sure it's an idea that has been examined before... does anybody have an URL to share?
As to the implicit criticism of proofs:

It's easy to read this article like "proofs suck because they only work if you write a huge proof that the program satisfies some particular spec and the spec keeps changing on you". At the most generous it seems silent on the relative value of proofs vs. anything else.

The first part of that is what somebody with no experience proving things about programs might reasonably infer from the only sentence that actually had any content about your experiences with program verification: "Proving that the software meets the rigorous specification written by the customer before they had seen the software work, before they had tried it, has been of secondary importance to me in my career." Gee, that's what proving programs is like? Who in their right mind wants to do that?

Then you go on to the main point, which I think was that trying to establish the correctness of a program by any means (I'd say verification, but you seem to want that word for proofs in particular) is relatively less valuable than making prototypes and talking to customers and so on when requirements are poorly known. Contrasting this specifically with proof suggests you have reason to believe fluid requirements specifically make proofs less valuable, and the argument doesn't apply to other sorts of verifications like test suites or code audits or casting entrails.

It's like replying to an article skeptical about the value of bikes compared to walking with an article warning that if you "ride bikes too much" you can get sunburns or even skin cancer, in the context it implies this is a reason walking is better than biking (even though you might very well catch more rays on foot).

You are arguing with yourself, not with me.

Here's what you did: you drew your own inferences what I said, then argued with the inferences, not with what I said.

Also, I must have missed the point in my article where I suggested casting entrails at all, or equated them with talking to customers.

Would you be so kind as to forward the article you read to me for my amusement? I do not recognize it from the post I actually wrote.
Oh, I don't think I'm arguing or complaining about the article at all. I didn't think the article was critical of provability, or even about proof vs testing at all. I did have a vague impression of a suggestion that proving things about software wasn't really a good idea. It took a while to even figure out where that impression was even coming from, so I thought I'd write it down, and hopefully help you figure out where people were getting that idea.

I don't think I claimed you wrote anything untrue, or even anything likely to be misinterpreted, just some things that might be vaguely unsettling to some pedantic souls. I guess I'm just ambiguous myself, open to even more damning interpretations when I'm just trying to offer some unemotional comments on an equanimous article :(

Casting entrails was supposed to be an example of another possible verification technique, which is presumably also harder to apply when requirements are not known and less valuable when you really ought to be talking to your customers anyway.

For the record, I think proving stuff about functional programs is often fun and would like to see more functional languages pick up type systems useful for the job.
Imperative languages are more trouble (aliasing), libraries and things like databases or soft real-time constraints even more so.

It would be interesting to look at program proofs and test suites in the context of changing requirements. I wonder if an internally specified program in a dependently typed language might handle changing requirements with a type-error driven refactoring like a typed language handles changing representation types. I haven't written anything like Lambda Tamer to try it out for myself.
Non6 again

"Proven correct" implies that we know a set of predicates concerning the program's behavior which, put together, define "correctness".

For mathematical programs, one can specify initial and final predicates easily. These will typically specify the program completely.

For a more complex piece of software, for example, a content management system. A total specification would be both over-rigid and useless. There would be no benefit to writing a predicate for every bit of text on every page.

For larger systems, consider the benefit to specifying other behaviors: the program closes all its sockets, the program consumes no more than 20MB of memory, the program does not write files outside of a given directory, the program does not deadlock.

The benefit to such an approach is that the project designer or architect can specify those types of properties and then derive other rules from these (foo() only occurs (dynamically) between calls to bar() and baz()).

The tools and techniques that prove correctness for mathematical software (wrt easily specified predicates) can easily apply to designer-specified properties in more real-world projects.

The benefit of this is that one architect or designer could more easily manage a piece of software (ensure it implements the design and follows the specified rules).

I don't think I follow your line of thought about consistency. My first thought is that somebody has to encode the spec's "consistency-items" into a semantically-unambiguous form. After you've done that, I'm not sure that we're talking about such different things.
"All your process of discovery should be completed before you start development otherwise you don't understand the problem statement."

If this really worked they wouldn't still be inventing alternatives like Agile Methods and so on. The problem is that 'the process of discovery' cannot, in practice, be complete before building the software.

You are right, we don't understand the problem statement. Worse, there is no 'statement'.

<< Home
Reg Braithwaite

Recent Writing
Homoiconic Technical Writing / raganwald.posterous.com

What I‘ve Learned From Failure / Kestrels, Quirky Birds, and Hopeless Egocentricity

rewrite_rails / andand / unfold.rb / string_to_proc.rb / dsl_and_let.rb / comprehension.rb / lazy_lists.rb

IS-STRICTLY-EQUIVALENT-TO-A / Spaghetti-Western Coding / Golf is a good program spoiled / Programming conventions as signals / Not all functions should be object methods

The Not So Big Software Design / Writing programs for people to read / Why Why Functional Programming Matters Matters / But Y would I want to do a thing like this?

The single most important thing you must do to improve your programming career / The Naïve Approach to Hiring People / No Disrespect / Take control of your interview / Three tips for getting a job through a recruiter / My favourite interview question

Exception Handling in Software Development / What if powerful languages and idioms only work for small teams? / Bricks / Which theory fits the evidence? / Still failing, still learning / What I’ve learned from failure

The unary ampersand in Ruby / (1..100).inject(&:+) / The challenge of teaching yourself a programming language / The significance of the meta-circular interpreter / Block-Structured Javascript / Haskell, Ruby and Infinity / Closures and Higher-Order Functions

Why Apple is more expensive than Amazon / Why we are the biggest obstacles to our own growth / Is software the documentation of business process mistakes? / We have lost control of the apparatus / What I’ve Learned From Sales I, II, III

The Narcissism of Small Code Differences / Billy Martin’s Technique for Managing his Manager / Three stories about The Tao / Programming Language Stories / Why You Need a Degree to Work For BigCo

06/04 / 07/04 / 08/04 / 09/04 / 10/04 / 11/04 / 12/04 / 01/05 / 02/05 / 03/05 / 04/05 / 06/05 / 07/05 / 08/05 / 09/05 / 10/05 / 11/05 / 01/06 / 02/06 / 03/06 / 04/06 / 05/06 / 06/06 / 07/06 / 08/06 / 09/06 / 10/06 / 11/06 / 12/06 / 01/07 / 02/07 / 03/07 / 04/07 / 05/07 / 06/07 / 07/07 / 08/07 / 09/07 / 10/07 / 11/07 / 12/07 / 01/08 / 02/08 / 03/08 / 04/08 / 05/08 / 06/08 / 07/08 /