Dr. Geer,
I read your article Heartbleed as Metaphor today, courtesy of Bruce Schneier. I find much of it thought-provoking and relevant, but one parenthetical clause in your analysis frustrates me severely (a frustration amplified by its appearance in Schneier’s quote):
Because errors are statistical while exploitation is not, either errors must be stamped out (which can only result in dampening the rate of innovation and rewarding corporate bigness)…
So, per your request:
More if you want it. Lots more. Be in touch.
I am not requesting more information from you at the moment. Rather, I’d like to point out a detail that seemingly every high-profile article on the subject of Heartbleed and the Apple SSL bug (aka “goto fail”) completely ignores: Had Apple or the OpenSSL team promoted the practice of unit testing, there is a high probability these errors would have been detected and corrected long before they shipped. Unit testing excels at detecting everyday, preventable programming errors exactly like those that led to both the Heartbleed and “goto fail” bugs.
I have written a proof-of-concept unit test for the “goto fail” bug and a proof-of-concept unit test for the Heartbleed bug. In the header comments of each source file are links to information describing how to build and execute the tests. I have not been programming steadily since September 2011, yet I was able to dive into two different code bases with which I had no prior experience, and within a few hours, provide small unit tests to both reproduce each bug and verify their fixes.
Though I sound boastful, my intent is instructive: If I could write unit tests for each bug with relative ease, given my rusty skills and ignorance of the code, the original developers could’ve written unit tests that would’ve found “goto fail” in September 2012 and Heartbleed in January 2012.
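To make the idea concrete, here is a minimal sketch in Python of the kind of small, fast test I have in mind. It is neither the proof-of-concept linked above nor OpenSSL's code: heartbeat_response and make_request are hypothetical names, and because Python cannot over-read memory the way C can, the sketch models only the length check whose absence caused Heartbleed. The message layout (a one-byte type, a two-byte payload length, the payload, and at least 16 bytes of padding) follows RFC 6520.

```python
import struct

HEADER_LEN = 3     # 1-byte message type + 2-byte payload length
MIN_PADDING = 16   # RFC 6520 requires at least 16 bytes of padding
HB_REQUEST = 1
HB_RESPONSE = 2


def heartbeat_response(record: bytes):
    """Hypothetical handler: return the response bytes, or None to discard."""
    if len(record) < HEADER_LEN + MIN_PADDING:
        return None
    msg_type, claimed_len = struct.unpack("!BH", record[:HEADER_LEN])
    if msg_type != HB_REQUEST:
        return None
    # The check that matters: the length the sender claims must fit inside
    # the record actually received. Omitting it is the Heartbleed pattern.
    if HEADER_LEN + claimed_len + MIN_PADDING > len(record):
        return None
    payload = record[HEADER_LEN:HEADER_LEN + claimed_len]
    header = struct.pack("!BH", HB_RESPONSE, claimed_len)
    return header + payload + bytes(MIN_PADDING)


def make_request(payload: bytes, claimed_len=None) -> bytes:
    """Test helper (hypothetical): build a request, optionally lying about length."""
    if claimed_len is None:
        claimed_len = len(payload)
    header = struct.pack("!BH", HB_REQUEST, claimed_len)
    return header + payload + bytes(MIN_PADDING)


def test_well_formed_request_echoes_payload():
    response = heartbeat_response(make_request(b"hello"))
    assert response is not None
    assert response[HEADER_LEN:HEADER_LEN + 5] == b"hello"


def test_overlong_claimed_length_is_discarded():
    # Five bytes of payload, but the header claims 16 KB.
    assert heartbeat_response(make_request(b"hello", claimed_len=0x4000)) is None


if __name__ == "__main__":
    test_well_formed_request_echoes_payload()
    test_overlong_claimed_length_is_discarded()
```

The second test is the important one: it feeds the handler a request that lies about its own length and insists that the request be discarded. If the bounds check were deleted or never written, the handler would return a response instead of None and the test would fail at once.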
This does not counter any of your other proposals with regard to monocultures, field-upgradability, abandonment, etc. However, unit testing, once developers are in the habit, incurs relatively little overhead and actually accelerates the rate of innovation. A code base well-covered by small, fast, automated tests provides confidence that each individual code change is not producing unintended side-effects. Unit tests aren’t necessarily a replacement for other tools, processes, and levels of testing, but they complement those other development factors wonderfully. They can’t guarantee completely bug-free code, but they do a lot to catch, or even avoid, innumerable programming errors long before they become embarrassing, expensive, and dangerous.
Thanks to the examples set by “goto fail” and Heartbleed, we can now consider unit tests also critical in helping to ensure the security of computer systems, particularly those based on Open Source code. In fact, unit tests can help fulfill the promise that “given enough eyes, all bugs are shallow”, which has been proven not to be an intrinsic feature of Open Source after all. That this concept has not yet entered popular conversation is maddening. Now I know how Paul Krugman feels.
For more in-depth information on how a unit test could have prevented the “goto fail” bug in particular, I invite you to read my draft article Finding More Than One Worm in the Apple. If you’ve only time for a summary, I have a one-page treatment of the “goto fail” bug as well as a one-page treatment of the Heartbleed bug. In the near future, I will also be publishing an extensive treatment of both bugs combined, which will address issues of development culture as well as code. I will post the link on my blog, and I would be happy to send it to you directly when it is published, if you so desire.
Thank you, and be in touch,
Mike Bland