Mike Bland

Music student, semi-retired programmer, and former Googler
mbland@acm.org

Most recent posts


The Official Apple SSL Bug Testing on the Toilet Episode
While My Heart Gently Bleeds
Heartbleed

The Google Testing on the Toilet team has published my episode about the Apple SSL bug, and I explain why this is for the greater social good.

- Boston
Tags: Apple, Google, goto fail, grouplets, Heartbleed, Testing Grouplet, TotT

Pinch me! It’s like a dream come true! Thanks to the generous opportunity afforded me by the Google Testing on the Toilet team, specifically Andrew Trenk and Jim McMaster, I’ve contributed an actual, official TotT episode, derived from my TotT-inspired treatment of the Apple SSL bug, that’s being published in Google restrooms everywhere this very week! It’s officially known as Episode 327[1]: “Finding More Than One Worm in the Apple”. (Yes, the same name as the Apple SSL bug article I submitted to the Communications of the ACM.)

I’d also like to say thanks to Chris Conway for apparently being the first to actively suggest the idea of turning my earlier treatment into a proper TotT episode, as well as to all of the folks who reviewed my Apple SSL bug slide deck and article. This truly has been, and continues to be, an inspired team effort, reminiscent of ye good olde days with the Testing Grouplet, et al.

WARNING: This announcement isn’t entirely a love-fest. I have a very small favor to ask of everyone, which I hope adds up to a very large favor in aggregate network effects—and things get heavy, seriously heavy towards the end of this post.

TotT Firsts
Seize the Moment
The Greater Good
Footnotes

TotT Firsts

This TotT episode breaks new ground in a number of ways. I’m the first “outsider” to contribute an official Testing on the Toilet episode; specifically, I’m the first ex-Googler to do so. This is the first time a TotT episode has been explicitly derived from an earlier work, with an author attribution and license notice embedded in it. This is the first TotT to use QR codes next to the ads at the bottom.[2]

Finally, this is the first time a TotT episode has explicitly addressed a real-world software defect, using it to show specifically:

  • how testing could’ve detected the bug and likely prevented it from even being written;
  • how the bug provides evidence of higher-level code quality and cultural issues; and
  • how code quality strongly depends on engineering culture.[3]

Yes, TotT has from the beginning aimed to address specific technical issues and influence engineering culture, but this is its first retrospective on a concrete, user-visible event based on a software bug.

On a personal note, despite the fact that I wrote a half-dozen or so TotT episodes back in the day, including the Test Certified/Test Mercenaries TotT episode (original image source), this is the first episode that actually carries my name on it.[4] Maybe one day yet I’ll catch up to my friend (and rival) Antoine Picard for the record of most episodes written! (Unless someone’s already out-written us both, of course.)

Seize the Moment

In other news, I’m still waiting to hear back from Communications of the ACM. Searching for “cacm publication decision time” turned up a 2010 editorial from the CACM entitled Revisiting the Publication Culture in Computing Research in which Editor-in-Chief Moshe Vardi notes that, at least at that time, “The average time to editorial decision for Communications is under two months”.

That said: I need your help raising awareness of these issues. If anyone knows someone, or knows someone who knows someone at Communications of the ACM, I’d deeply appreciate an extra good word put in to increase the chances of getting my article published. I’d appreciate reshares of my blog posts or submissions to any other appropriate outlet. Suggestions for where to submit posts myself are also welcome. If you’re feeling especially generous, post a link to one of my articles or code samples as your status message. You could even post one of the original printer-friendly Finding the Worm or While My Heart Gently Bleeds articles in your office (and not just in the restrooms).

Allow me to make clear the reasons behind my sense of urgency.

I’m hopeful that this TotT episode, combined with getting the full article published in CACM (or another publication, should CACM decline), will really drive discussion around the Apple SSL and Heartbleed bugs, spreading awareness and improving the quality of discourse a few notches—not just around these specific bugs, but around the topics of unit testing and code quality in general. These bugs are a perfect storm of factors that make them ideal for such a discussion:

  • the actual flaw is very obvious in the case of the Apple bug, and the Heartbleed flaw requires only a small amount of technical explanation;
  • the unit testing approaches that would’ve prevented them are very straightforward;
  • user awareness of the flaws and their severity is even broader than that of other well-known software defects, generating popular as well as technical press; and
  • the existing explanations that either dismiss the ability of unit testing to find such bugs or otherwise excuse the flaw are demonstrably unsound.
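
To make the first two points concrete, here is a minimal sketch of the shape of the goto fail flaw and of a check that exposes it. It is not Apple's code: `step_fn`, `step_ok`, `step_fail`, `verify_buggy`, `verify_fixed`, and `rejects_bad_final_step` are hypothetical names invented for illustration, and the two "steps" merely stand in for the hash and signature operations of the real handshake code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the hash and signature steps; each returns
 * 0 on success and nonzero on failure. */
typedef int (*step_fn)(void);

int step_ok(void)   { return 0; }
int step_fail(void) { return -1; }

/* Buggy shape: the duplicated `goto fail;` is not guarded by the `if`,
 * so the final check never runs and err is still 0 (success). */
int verify_buggy(step_fn update, step_fn final_check)
{
    int err;

    if ((err = update()) != 0)
        goto fail;
        goto fail;  /* duplicated line: always executes */
    if ((err = final_check()) != 0)
        goto fail;

fail:
    return err;
}

/* Same logic with the duplicated line removed. */
int verify_fixed(step_fn update, step_fn final_check)
{
    int err;

    if ((err = update()) != 0)
        goto fail;
    if ((err = final_check()) != 0)
        goto fail;

fail:
    return err;
}

/* Returns 1 if verify() reports an error when the final step fails,
 * 0 if it wrongly reports success. */
int rejects_bad_final_step(int (*verify)(step_fn, step_fn))
{
    return verify(step_ok, step_fail) != 0;
}
```

Calling `rejects_bad_final_step(verify_buggy)` yields 0 while `rejects_bad_final_step(verify_fixed)` yields 1, so a single negative test case separates the two versions. The design point is that the verification logic is reachable as a small function whose collaborators can be swapped for failing stand-ins.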

If we don’t seize upon these opportunities to make a strong case for the importance and impact of automated testing, code quality, and engineering culture, and hold companies and colleagues accountable for avoidable flaws, how many more preventable, massively widespread vulnerabilities and failures will we see? What fate awaits us if we don’t take appropriate corrective measures in the wake of goto fail and Heartbleed? How long will the excuses last, and what will they ultimately buy us?

And what good is the oft-quoted bedrock principle of Open Source software, Linus’s Law (“Given enough eyeballs, all bugs are shallow”), if people wear blinders and refuse to address the real issues that lead to easily-preventable, catastrophic defects?

The Greater Good

My insistence upon pursuing these issues has no basis in any sort of ill will towards Apple, OpenSSL, or either of their developers. By way of analogy: General Motors knowingly continued to install a flawed ignition switch in new cars from 2002-2006, and neglected to issue a recall, causing thirteen wrongful deaths due to sudden engine shutdowns. Apparently GM declined to change the switches in 2005 because it would have added about a dollar to the cost of each car, and updated them quietly in 2006 without changing the part number—an apparently willful deception which may lead to criminal charges.

Are articles reporting this story mean-spirited attacks on GM, or a means of holding GM (and others) accountable by presenting evidence for the sake of transparency and the greater social good?

I have worked to produce artifacts of sound reasoning based on years of experience and hard evidence—working code in the form of the Apple patch-and-test tarball and heartbleed_test.c—to back up my rather straightforward claim: A unit testing culture most likely would’ve prevented the catastrophic goto fail and Heartbleed security vulnerabilities from ever existing.

True, testing can prove only the existence of bugs, not their absence, and you can never expect to find literally every bug with unit testing. But that is not a sound excuse not to try to catch all the ones you can. Here’s a question I’d like testing skeptics to answer: How can one have more confidence in untested than tested code, especially as the complexity wrought by the number of features, contributors, and users combinatorially explodes, and as the stakes rise with every person who depends on technology to ensure their privacy, the quality of their critical personal and professional business, their physical well-being, and their very safety?

Given the extent to which modern society has come to depend on software, the community of software practitioners must hold its members accountable, however informally, for failing to adhere to fundamental best practices designed to reduce the occurrence of preventable defects—and must step forward not to punish mistakes, but to help address root causes leading to such defects to the extent humanly possible. Society deserves solutions, not excuses.

Footnotes

[1] 3 × 3 × 3 == 27. I’ve always been fascinated by that number, and not just because of the 27 club.

[2] The first TotT to use a QR code at all, if memory serves, was one suggested by Nathan York to Mark Striebeck and myself during the TAP Fixit, whereby the content was one giant QR code linking to more Test Automation Platform information. I came up with the name, “Too Much for One Toilet”, and John Penix produced the episode, embedding the letters T-A-P within the code (which its error correction could tolerate). There was also another episode using QR codes as an example in demonstrating some testing concepts.

[3] There was an episode in early 2009 which dealt with testing data in the wake of the every-search-result-is-malware bug that appeared one weekend in January 2009.

[4] Well, the first time an episode originally ran with my name on it; apparently I’d forgotten that the Blast from TotT Past rerun of my early episode introducing Pyfakefs listed me as the author. Thanks to Andrew Trenk for reminding me about that!




A Testing on the Toilet-inspired article about the Heartbleed bug and how it could have been prevented

- Boston
Tags: Apple, goto fail, Heartbleed, programming, The Beatles, TotT

Printer-friendly version: While My Heart Gently Bleeds

I look at the world and I notice it’s turning / While my guitar gently weeps
With every mistake we must surely be learning / Still my guitar gently weeps
—"While My Guitar Gently Weeps", George Harrison

The “Heartbleed” SSL vulnerability had existed since the release of OpenSSL 1.0.1-beta in January 2012. The flaw was not discovered and fixed until version 1.0.1g in April 2014. A missing check on the client-supplied buffer size for a TLS heartbeat made it possible to retrieve up to 64k of memory per request from a vulnerable system without leaving a trace. (Details, Analysis)

/* From ssl/d1_both.c;
 * also appears duplicated in tls1_process_heartbeat() in ssl/t1_lib.c
 */
int
dtls1_process_heartbeat(SSL *s)
  {
  unsigned char *p = &s->s3->rrec.data[0], *pl;
  unsigned short hbtype;
  unsigned int payload;
  unsigned int padding = 16; /* Use minimum padding */

  /* Read type and payload length first */
  hbtype = *p++;
  n2s(p, payload);
  pl = p;

  /* ...snip... */
  if (hbtype == TLS1_HB_REQUEST)
    {
    unsigned char *buffer, *bp;
    int r;

    /* Allocate memory for the response, size is 1 byte
     * message type, plus 2 bytes payload length, plus
     * payload, plus padding
     */
    buffer = OPENSSL_malloc(1 + 2 + payload + padding);
    bp = buffer;

    /* Enter response type, length and copy payload */
    *bp++ = TLS1_HB_RESPONSE;
    s2n(payload, bp);
    memcpy(bp, pl, payload);

The code introducing the vulnerability was reviewed, then merged into OpenSSL with no tests added by the commit—demonstrating that code review, while important, is not sufficient to catch all serious bugs. Some have asserted that the bug could have been found with a small unit test, while others assumed it would have required a system-level test. The proof-of-concept Heartbleed unit test proves that a small unit test could have been written to detect the bug, or could have accompanied its fix to verify it and prevent a regression.
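
As a rough illustration of the missing check, the sketch below models the heartbeat payload copy as a standalone function with the length validation in place. It is a simplified model, not OpenSSL code and not the contents of heartbleed_test.c: the function `hb_copy_payload`, its parameters, and the `HB_*` constants are hypothetical names, and the record is assumed to be a flat buffer of 1 type byte, 2 big-endian length bytes, the payload, and at least 16 bytes of padding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HB_HEADER_LEN  3   /* 1 byte type + 2 bytes payload length */
#define HB_MIN_PADDING 16  /* minimum padding after the payload */

/* Copies the heartbeat payload from rec into out.
 * Returns the number of payload bytes copied, or -1 if the record
 * should be discarded because its claimed length cannot be trusted. */
int hb_copy_payload(const unsigned char *rec, size_t rec_len,
                    unsigned char *out, size_t out_len)
{
    size_t payload;

    /* Too short to hold even an empty heartbeat. */
    if (rec_len < HB_HEADER_LEN + HB_MIN_PADDING)
        return -1;

    /* Client-supplied length, read big-endian from the header. */
    payload = ((size_t)rec[1] << 8) | rec[2];

    /* The check the vulnerable code lacked: the claimed payload must
     * fit inside the bytes that were actually received. */
    if (HB_HEADER_LEN + payload + HB_MIN_PADDING > rec_len)
        return -1;

    /* Never write past the caller's output buffer. */
    if (payload > out_len)
        return -1;

    memcpy(out, rec + HB_HEADER_LEN, payload);
    return (int)payload;
}
```

A test against this model needs only two kinds of input: a well-formed record, whose payload should come back intact, and a record whose length field claims more bytes than were sent, which should be rejected. Without the second check the `memcpy` would read past the end of `rec`, which is the essence of the flaw.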

While it is easier to write the test after knowing where to look for the bug, a unit testing culture would have likely required a series of smaller, well-tested changes for the original feature, greatly increasing the probability that the flaw would have been discovered early. This is because good testing practice involves testing invalid inputs as well as valid, and good code review practice would require such cases to be covered by tests. People used to such a culture would be more sensitive to invalid input issues and mindful of how to test for and avoid them. As doctors wash their hands before surgery to prevent deadly infections, unit testing security-critical code can help prevent catastrophic vulnerabilities.

As many of us know by now, applying unit tests to an existing code base is a solved problem. The only thing standing in the way of fixing preventable bugs like this before they ship is the cultural bias against unit testing, whether because it is not considered, not taken seriously, or outright dismissed. Along with the Apple SSL bug, the Heartbleed bug provides another prime opportunity to demonstrate the value of unit testing to the entire tech community.

The OpenSSL team has provided a great service to the tech industry for years with an otherwise exemplary track record; as a result, many have excused the Heartbleed bug due to the small, all-volunteer team being understaffed and underfunded. However, in light of Heartbleed, we must not dismiss one of the most serious computer security flaws in years as unpreventable. We can help Open Source projects not just by contributing features, fixes, and money, but also by demonstrating how to provide better unit testing—by contributing unit tests to Open Source projects and writing blog posts, journal articles, etc. Given the availability of the open-source code, each one of us can help to seize such precious opportunities to fix the problem of severe, undetected, preventable bugs.




I've written a complete proof-of-concept unit and regression test for the Heartbleed bug, and am pretty happy with it

- Boston
Tags: Heartbleed, programming, technical

I’ve taken my best stab at writing a complete unit/regression test for the Heartbleed bug and I’m fairly happy with the results. You can grab it here: heartbleed_test.c

It exercises all of the code paths introduced in the fix for the Heartbleed bug. It tests both the positive and negative cases. The test cases that fail for OpenSSL 1.0.1-beta1 pass for OpenSSL 1.0.1g. It’s small, it’s fast, and it didn’t require that much setup once I figured out all the parts I needed.
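
For readers who haven’t seen this style of test, here is a minimal table-driven sketch of covering positive and negative cases together. It is not taken from heartbleed_test.c and is not necessarily how that file is organized: `hb_length_ok`, `struct hb_case`, `hb_cases`, and `hb_run_cases` are hypothetical names, and the predicate is a stand-in for the real code under test.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define HB_HEADER_LEN  3   /* 1 byte type + 2 bytes payload length */
#define HB_MIN_PADDING 16  /* minimum padding after the payload */

/* Hypothetical predicate standing in for the code under test:
 * does the claimed payload fit in the record actually received? */
int hb_length_ok(size_t record_len, size_t claimed_payload)
{
    return HB_HEADER_LEN + claimed_payload + HB_MIN_PADDING <= record_len;
}

struct hb_case {
    const char *name;
    size_t record_len;       /* bytes actually received */
    size_t claimed_payload;  /* length field supplied by the peer */
    int expect_ok;           /* 1 = accept, 0 = discard */
};

/* Positive and negative cases side by side, including the boundaries. */
static const struct hb_case hb_cases[] = {
    { "empty payload, exact fit",             19, 0,     1 },
    { "payload fits exactly",                 23, 4,     1 },
    { "payload one byte too long",            23, 5,     0 },
    { "record shorter than header + padding", 18, 0,     0 },
    { "maximum claimed length, tiny record",  23, 65535, 0 },
};

/* Runs every case and returns the number of failures. */
int hb_run_cases(void)
{
    size_t i;
    int failures = 0;

    for (i = 0; i < sizeof hb_cases / sizeof hb_cases[0]; i++) {
        const struct hb_case *c = &hb_cases[i];
        int got = hb_length_ok(c->record_len, c->claimed_payload);

        if (got != c->expect_ok) {
            fprintf(stderr, "FAIL: %s (got %d, want %d)\n",
                    c->name, got, c->expect_ok);
            failures++;
        }
    }
    return failures;
}
```

A `main` that returns `hb_run_cases() != 0` is enough to turn this into a pass/fail program. Keeping the cases in a table makes it cheap to add the next invalid input someone thinks of during code review.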

The complete instructions on how to build it and a description of the output are contained in the header comments. I built it on my OS X 10.9.2 system using Xcode/Apple LLVM version 5.1 (clang-503.0.40). Feel free to let me know if it doesn’t work on Linux or other platforms. Other constructive comments are, of course, always welcome.

I wish I could say I did this completely out of the goodness of my heart. But the truth is, I’m a vain, bitter, petty man, and I’ve got a bone to pick. While I’m grateful to my friend for submitting yesterday’s AutoTest Central-syndicated Heartbleed post to Reddit, the fact that it got downvoted out of existence and inspired the following useless and harmful commentary really got under my skin:

ruinercollector: So many Monday morning quarterbacks.
Synackaon: I hate them too

Hollow posturing such as this, which contributes no social value whatsoever, is exactly why I’m so skeptical of low-/no-barrier public forums like Reddit, and why I prefer forums where there are no shadows in which anonymous cowards can hide. However, were I the troll-feeding type, maybe I’d say something like:

Monday morning quarterback? Try retired Super Bowl Champion!

or:

So many jackass cowboys who’d rather excuse and dismiss catastrophic failures than help solve the problem of making sure they don’t happen to the extent humanly possible.

Guess I’ve always been a bit thin-skinned and grumpy. Luckily that doesn’t ever seem to stop me! And what’s more, there’s nothing quite like having working code in your pocket to come back at everyone who says unit testing for bugs like this and the Apple SSL bug can’t be done. In other words:

QED, bitch!