Mike Bland

Instigator

Haters Gonna Hate

Fighting back against critics of automated developer testing in light of the Testing Grouplet et al.'s impact on Google engineering

- Brooklyn
Tags: Google, grouplets, Jimi, Testing Grouplet

Taking a quick break from the “whaling series” to call bullshit on those who dismiss automated developer testing as universally unnecessary and wasteful and/or characterize those of us who advocate automated developer testing as extremists, and to set the record straight about those of us who participated in the Testing Grouplet, the Testing Grouplet’s Test Certified program, and the Test Mercenaries. Typically I prefer not to dignify arrogant dismissals with a response, or take offense to overly-strong language that risks drowning out a perfectly valid argument, but I’m feeling equal parts grumpy and verbose today, so I’m ready to spit out a few of my own fighting words. To be fair, I’m certain there are far more folks I could pick on here, but I chose these just because they drifted into my awareness at some point in time and particularly got under my skin for one reason or another.

And, yes, I realize most of these articles that I cite are several years old, that my responses are a bit late. What can I say? I was too busy at the time writing and testing code, and/or helping change Google engineering culture, to allow such silly things to distract me. But now that I’m semi-retired and waiting to start a new phase in life, I’ve got plenty of time to get pissed off and vent a little whenever something comes along and tickles my trigger. Though, honestly, I think this outburst will release enough pent up pressure to keep me calm for a while.

Confirmation Bias
Misleading Example
Alarmism
Passing Misrepresentation
Why the hate?
Why the love?
Conclusion
Footnotes

Confirmation Bias

Joel Spolsky is well-known for his extensive history of online essays on matters of software development, aka Joel on Software. Much of what he has to say is of value, but his consistent contempt for unit testing is as frustrating as it is well-known. At least, as of 2009 it was; a Google search for "Joel Spolsky unit testing" doesn’t turn up anything more recent. Maybe he’s changed his thinking lately, but if he has, his reflections on any such change-of-heart either remain unpublished, or have not garnered as much PageRank. I stopped reading Joel’s articles regularly, and am too lazy to do more research than a quick search in my moment of agitation. But the top hit for that search is From Podcast 38, in which he seems to conclude, emphatically as ever, that because unit testing could be misapplied and overdone and abused, it’s best avoided altogether. His companion in the podcast, Jeff Atwood, actually supports people practicing automated testing to the extent it appears to help them adapt the code to evolving business needs. Yet he also takes a position complementary to Joel’s:

And I don’t want to come out and say I’m against [unit] testing, because I’m really not. Anything that improves quality is good. But there’s multiple axes you’re working on here; quality is just one axis. And I find, sadly, to be completely honest with everybody listening, quality really doesn’t matter that much, in the big scheme of things…

I’ll revisit this assertion later. But even more than this podcast, the one post demonstrating Joel’s anti-automated testing sentiment that torques my jaws the most is his paean to Jamie Zawinski, The Duct Tape Programmer, in which he quotes from Peter Seibel’s interview with Jamie from Coders At Work:

[unit tests] sound great in principle. Given a leisurely development pace, that’s certainly the way to go. But when you’re looking at, ‘We’ve got to go from zero to done in six weeks,’ well, I can’t do that unless I cut something out. And what I’m going to cut out is the stuff that’s not absolutely critical. And unit tests are not critical. If there’s no unit test the customer isn’t going to complain about that.

On the surface, this jibes with what Joel and Jeff both had to say in the aforementioned podcast. Joel’s post does not appear to suggest that one should write unit tests unless one happens to be a Jamie Zawinski; it appears to suggest that Jamie didn’t need them, and that if you want to be a rockstar like him (“My hero,” says Joel immediately before the quote), you shouldn’t bother, either. If that’s a misinterpretation, I’m open to correction; but I’m hardly alone in having that impression, and the paragraph immediately after the quote reads:

Remember, before you freak out, that Zawinski was at Netscape when they were changing the world. They thought that they only had a few months before someone else came along and ate their lunch. A lot of important code is like that.

So, what I conclude from that is, a lot of “important” code usually doesn’t have unit tests. So if you work on “important” code, or if you want to, chances are unit testing will be more trouble than it’s worth.

The punchline comes from what Joel chose not to quote, from the very next paragraph from the Coders At Work interview:

I hope I don’t sound like I’m saying, “Testing is for chumps.” It’s not. It’s a matter of priorities. Are you trying to write good software or are you trying to be done by next week? You can’t do both.

So, it seems, to Jamie, it was a conscious decision at the time to not write unit tests when developing Netscape, and I don’t find anything wrong with that. Netscape was racing to be the first and best web browser in the world, and started life with a very small, focused, close-knit team. With so few developers all focused on a single project, or handful of projects, working on brand new code that everyone is very familiar with, automated testing may impose more overhead than it’s worth, especially when there’s little precedent for the system you’re creating and you don’t know if someone’s going to beat you to the market and eat your lunch. Completely valid point, and Google did much the same thing for the first few years of its existence.

Also bear in mind it was the early-to-mid 90’s, before testing frameworks, tools, and knowledge were as widely available as they are today—in fact, it was a consequence of Netscape’s launch and subsequent legacy that such artifacts and knowledge could eventually be broadly developed and disseminated. Plus, Netscape Navigator, like most software of the day, was developed to run on a single desktop at a time, not provide highly-scalable, distributed backend services managing Big Data for millions of users—again, something that was a consequence of Netscape’s efforts, not a precursor. Insisting on thorough automated testing in that environment would’ve created agonizing overhead, before the scale of the company in terms of projects and number of developers demanded it. Despite that, Jamie does mention in the same interview that he did have his own hand-rolled tests for particularly important and tricky bits of code, and that he did write more tests when writing new classes for Grendel, the Java rewrite of Netscape’s mail and news client.

In other words, my beef with Joel’s article (with respect to unit testing, at least) is that he picked a quote from a bona-fide Programming Hero that supported his conviction that unit testing is a frivolous waste of time, without providing the context that clearly demonstrates that the choice to skip unit testing was conscious, very context-dependent, and not a permanent and universal prescription on behalf of said Programming Hero.[1] Peter Seibel himself felt compelled to address the controversy sparked by this specific misquote, and wrote a comprehensive online response, Unit testing in Coders at Work, clarifying the context of Jamie’s remarks and highlighting the thoughts of many of the other interviewees of the book, most of whom are rather unambiguously in support of automated testing as an important tool to have at one’s disposal.

I do not quarrel with Joel’s assertion that automated testing doesn’t work well for him or his company, and I feel no need to convince him otherwise, as he has had demonstrable success without it for many years. I believe I understand his situation quite well: If you have a single engineering office with only a dozen or so engineers working on a small handful of projects, and those engineers are highly-trained and motivated and all eat lunch together every day, maybe all your communication, coordination, and validation bases are covered, and you don’t need automated testing—or even code reviews, for that matter. Or at a company like his previous one, Microsoft, where the organizational structure—at least at the time, as far as I can tell, and feel free to correct me if I’m wrong—was not set up to promote extensive peer-to-peer interaction and continuous software integration, meaning that a team could largely put its head down and stay focused only on its own product largely to the exclusion of other development activity in the company, I can also see how you can get away without it up to a point.

And I also completely agree with his philosophy on coding standards, Making Wrong Code Look Wrong, where he not only makes an excellent case for programming idioms loaded with semantic implications as a means of detecting errors and ensuring code quality up-front, but also accomplishes the seemingly impossible task of justifying the original intention, at least, of the Apps-variety of Hungarian notation. The same principles, encoded in the Google style guides, carried Google through many successful years absent any serious application of automated developer testing, and are still critical to its continued success. I can’t imagine Google achieving the scale of successful C++ development as it has without such standards, even in the presence of widespread automated testing.

My quarrel is with his assertion that unit testing (and perhaps, though this may be reading too much into things, automated testing in general) is universally unnecessary on principle, based on his personal experience, his perspective, and out-of-context quotes cited as evidence confirming his stance. With hundreds or thousands of engineers across multiple sites and timezones, with more joining all the time, working on hundreds of projects, many of which are integrated with one another, you reach a scale where things break down, even if those thousands of engineers are notoriously well-trained and capable. Many people experience a similar breakdown on much smaller projects in much smaller companies, and have found the application of automated testing to help them immensely in recovering from such situations, as I did on a small team at Northrop Grumman Mission Systems. And based on my experience, and the experience of thousands of Google engineers in particular, experience quite different from Joel Spolsky’s, it is clear that automated testing is an incredibly powerful and helpful tool. It is not a silver bullet, but it is a bullet.

Another, secondary quarrel I have is with the tendency of Joel’s assertions, given his authoritative tone and his level of visibility in the blogosphere, to inspire “me too” proclamations: the Echo Chamber aspect of the Interwebs in full effect.[2] I won’t afford them the dignity of a direct link here, but they’re easy to find. The pattern is roughly thus:

Joel says unit testing is unnecessary. In my world of software development, I’ve had a mildly negative experience with unit testing, too, for this reason or that which amounts to Doing It Wrong, and have never seen an example of someone Doing It Right, or choose to wilfully ignore such. I can’t imagine another context of software development, differing from my own experience and point of view, whereby long-term product demands, ongoing development challenges, and/or company scale require the application of automated testing, or where tools, knowledge and training might make it more feasible. Therefore, Joel is right, and I agree, and you shouldn’t waste your time with unit testing either.

Grrrr…

Misleading Example

Also addressed in Peter Seibel’s Unit testing in Coders at Work article was the issue of competing Sudoku-solver implementations by Google’s Peter Norvig and Test-Driven Development advocate Ron Jeffries. Norvig’s elegant solution, compared against Jeffries’s verbose and ultimately incomplete implementation, seemed to some a clear and harsh conviction of the wastefulness and inefficacy of Test-Driven Development (TDD) in particular—characterized as “test-first” development, where tests are written to nonexistent interfaces and/or behaviors, after which the production code is written to get the test to pass. Though TDD is a specific practice within the larger scope of automated testing, an indictment of TDD threatens to become, in the minds of those so-inclined, tantamount to an indictment of unit testing and, potentially, automated testing in general.

However, as Seibel points out with more context from Coders At Work, Norvig admits he frequently applies test-driven design, and that Jeffries’s problem wasn’t that he was testing, it’s that he appeared to substitute TDD for sufficient background knowledge of the problem itself. Seibel extensively analyzes the two competing approaches, and concurs with Norvig: TDD isn’t necessarily to blame; not stopping to think about and/or research a correct solution before applying TDD is.

Testing is a tool, not a solution, not a substitute for knowledge. It’s a tool to ensure the knowledge represented by the code under test is not corrupted by errors, either basic errors akin to typographical errors (e.g. off-by-one errors, mixing up variable names); logical errors due to fragile corner cases and unforeseen side effects; or larger system-level errors due to an incomplete understanding of the code, such as changes which break contracts or behaviors on which existing users of that code rely, or on which future users will come to rely. It’s a tool to minimize the time spent discovering, diagnosing, and fixing bugs, and validating those bug fixes, whether at the time the original author writes the code or through years and years of changes by different programmers adapting the code to fulfill new requirements. It’s a tool to help ensure development velocity and productivity continue to increase in the long-term by giving people who work with the code relatively high confidence that their new changes are not creating problems for anyone else.
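To make those three categories a bit more concrete, here is a minimal sketch in Python using the standard unittest module. The function last_n_lines and its tests are hypothetical, invented purely for illustration; they are not drawn from any Google code or from any of the articles discussed here. Each test guards against one of the error types above: an off-by-one slip, a fragile corner case, and a behavior that callers may come to rely on.

```python
import unittest


def last_n_lines(lines, n):
    """Return the final n items of lines (hypothetical helper for illustration)."""
    if n <= 0:
        return []
    return lines[-n:]


class LastNLinesTest(unittest.TestCase):
    def test_returns_exactly_n_lines(self):
        # Basic error: an off-by-one slip such as lines[-n + 1:] would fail here.
        self.assertEqual(last_n_lines(["a", "b", "c", "d"], 2), ["c", "d"])

    def test_zero_returns_nothing(self):
        # Corner case: lines[-0:] is the whole list, so n == 0 needs a guard.
        self.assertEqual(last_n_lines(["a", "b"], 0), [])

    def test_n_larger_than_input_returns_everything(self):
        # Contract: asking for more lines than exist is not an error.
        self.assertEqual(last_n_lines(["a"], 5), ["a"])
```

Saved to a file, these can be run with python -m unittest. If someone later "simplifies" the function by removing the n <= 0 guard, the second test fails immediately, which is the kind of quiet breakage the paragraph above describes.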

Alarmism

I can’t remember how I stumbled on Dhanji Prasanna’s post entitled Unit Testing: A False Idol. Dhanji is a former Googler, but we were never acquaintances. This post, ostensibly about the dubious value of unit testing in particular—not automated testing in general, though I wonder if the subtlety is apparent to everyone—is a bit confusing to me. It has an alarmist, anti-testing-sounding title, bold assertions that some developers’ enthusiasm for testing “borders on religion”, and claims that unit test maintenance has proved a pain in his own projects because of the maintenance burden of mock objects.

Of course, the one line that raised my hackles the most was in the opening paragraph, a clear reference to the Testing Grouplet’s Test Certified program, and possibly to Testing Fixits as well:

…the code coverage metric is a prized goal, one which misguided engineering managers give out t-shirts and other pedestrian awards for. (At Google you similarly received certifications based on levels of coverage—to be fair, among other criteria.) This is a false idol—don’t worship it!

I will agree with the above on one point: T-shirts should be considered harmful. Aside from that, I take issue with Dhanji’s apparent compulsion to express otherwise reasonable points, which I interpret thus:

  • Apply unit testing in a reasonable fashion, but remain aware that it is not a panacea and you can take it too far.
  • Mock objects are not a catch-all solution; they can increase the maintenance burden of a system when misapplied.
  • Code coverage should not be an all-consuming, objectively absolute goal.
  • Consider whether higher-level integration tests (“medium” or “large” in Google’s Small/Medium/Large test size schema) would be better suited to the task in certain contexts.

using inflammatory imagery that equates unit testing with religion, and portrays those who diligently practice such “religion” as fever-infected and hungry for “pedestrian awards”. True, Google culture in particular is very metrics-recognition-awards-promotion-centric, and the Testing Grouplet’s Test Certified program was designed with awareness of that priority/incentive structure in mind. But those of us (or, at least, most of us) at the core of the movement were always careful to point out that the Test Certified criteria, and in particular the code coverage goals, were just tools and signals intended to help the team achieve their overall development goals, not stand as primary goals for their own sake. The code coverage goals, especially, were rough guidelines that each team should consider and adapt to their realistic needs; plenty of teams achieved Test Certified Level Two with far less than 70% small test coverage.[3]

However—and this is the confusing part—in spite of the searing condemnation of those who enthusiastically advocate unit testing implied in the language of Dhanji’s post, the post still asserts that not all unit testing is bad, and that many Google products have benefitted from “rigourous, modular unit tests”, particularly some very popular products that rely to some degree on components he’s personally written. What’s more, as I learned from the “About Me” section of the blog, Dhanji is the author of the book Dependency Injection: Design patterns using Spring and Guice. As I’ve written earlier, dependency injection is a fundamental design-for-testability concept, and I plan to mention what little I know about Guice (being a Java tool) in my upcoming “whaling” post about Google testing tools. I’m confused as to why Dhanji felt motivated to rail so forcefully against the pitfalls of unit testing and those who are excited about it, while it appears he actually practices automated testing in a thoughtful, productive fashion.
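For readers who haven’t seen dependency injection used for testability, here is a minimal sketch in Python. It uses plain constructor injection and a hand-written fake; it does not use Guice (a Java framework) and is not taken from Dhanji’s book. The class names SessionToken and FakeClock are hypothetical, chosen only for illustration. The idea is that the class receives its collaborator from outside rather than reaching for it directly, so a test can substitute a controllable stand-in.

```python
import unittest


class FakeClock:
    """Hand-written test double: reports whatever time the test sets."""

    def __init__(self, now):
        self.now = now

    def time(self):
        return self.now


class SessionToken:
    """Hypothetical class whose clock is injected through the constructor."""

    def __init__(self, clock, lifetime_seconds):
        self._clock = clock
        self._expires_at = clock.time() + lifetime_seconds

    def is_expired(self):
        return self._clock.time() >= self._expires_at


class SessionTokenTest(unittest.TestCase):
    def test_token_expires_after_lifetime(self):
        clock = FakeClock(now=1000.0)
        token = SessionToken(clock, lifetime_seconds=60)
        self.assertFalse(token.is_expired())

        clock.now = 1060.0  # Advance time without actually sleeping.
        self.assertTrue(token.is_expired())
```

In production code, one could pass Python’s standard time module itself as the clock, since it provides a time() function with the same shape. A small fake like this also tends to carry less maintenance burden than an elaborate mock, which speaks to the concern about mock objects noted above.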

Certainly some folks do go overboard in their advocacy of Test-Driven Development and achieving coverage goals. They should be called on whatever irrational behavior they are bold enough to display and impose upon others. Testing is not a religion and should not be practiced and promoted as one—but neither should it be persecuted like one. I do not think it’s useful to use the same extreme language, either in support of testing or against people who push it too hard, in addressing the core issue: What’s the best way to write automated tests for a particular piece of code? Polarizing the community into camps of religious fanatics and, by implication, those who are more enlightened and responsible—and intelligent, good-looking, better people, etc.—only serves to make rational, productive debate and positive change less achievable, as sides are taken, heels dig in, and invective impedes and erodes communication, critical thought, and mutual respect. Have we learned nothing from the current American political climate?

In the end, my take on this article is that Dhanji has some interesting and valuable things to say about the applicability, effectiveness, and pitfalls of various testing approaches and their alternatives based on his personal experience. He speaks with some degree of authority about these things. I’m just frustrated and sad that his blog post resorts to sensational accusations and conclusions to present his theses, denigrating his fellow programmers and misrepresenting their intentions in the process, specifically of those who worked very hard to improve the overall productivity of Google engineering via increased adoption of automated testing—and who, by extension, made fundamental contributions to the continued success of the company as a whole.

Passing Misrepresentation

It was one small detail of a recent Google+ post by Steve Yegge, a Googler who’s very well-known for his public “rants”—his word, not mine—that I stumbled upon late at night and completely by accident that provided the last straw which inspired this rant of my own. He and I have never met or otherwise communicated.

In fact, I was only skimming this particular post, not really reading it, when this tiny passage caught my eye:

1. Software should aim to be bug free before it launches. (Banner claim: “Debugging Sucks!”) Make sure your types and interfaces are all modeled, your tests are all written, and your system is fully specified before you launch. Or else be prepared for the worst!

“Debugging sucks” is famously the first half of the Testing Grouplet’s official motto, “Debugging sucks. Testing rocks.” This motto is embedded in the Testing Grouplet’s “lightbulb” logo, appearing right at the top of every episode of Testing on the Toilet (TotT), which you can see very clearly in the images in my Testing on the Toilet post. Steve does not appear to be at all against unit testing, and honestly, of all the things that irked me enough to write this post, this passing comment ranks squarely at the bottom. Still, it evoked enough strong, cascading emotion I feel compelled to clarify what “Debugging sucks” really meant.

The overall thrust of Steve’s post, as I understand it, is that many software development debates can be framed in terms of the conservative vs. liberal political spectrum. His point in making the “Debugging sucks!” reference is to imply that there is a population of engineers firmly at the “conservative” end of the spectrum who are risk-averse to the point of wanting absolute confidence that any released software be bug-free. I agree that this may be a valid characterization of one extreme of software development philosophy. My personal objection is that this was absolutely not the attitude of the Testing Grouplet at all, and I somewhat resent that part of the Testing Grouplet’s motto has been misapplied to represent this extreme.

“Debugging sucks” did not mean that the Testing Grouplet had any illusions that there was such a thing as “perfect” testing and “perfect” design that would result in “perfect” software being released, completely free from bugs. The full motto—“Debugging sucks. Testing rocks.”—means that testing is generally a more effective and productive use of development time. No one argues that all bugs can be found before a release, by means of testing or anything else, but that a lot of them can be found and fixed early by applying a reasonable amount of testing (and code reviews, etc.), minimizing the amount of time spent debugging and the risk associated with production bugs—which, ultimately, means preserving reputation and, consequently, revenue, though it’s always been a challenge to draw a straight line between automated tests and the bottom line.

“Debugging sucks” if that’s what you spend all your time doing because you have no tests catching all of the easy and stupid bugs up-front. In fact, in my experience, diligent application of automated testing actually helps debugging rock, because it’s usually only the trickiest and most interesting bugs that are able to navigate through a decent-sized testing net and manifest in a released product. And if you aren’t swarmed by a bunch of boring little bugs, you can better focus on the ones that do appear and appreciate their beauty, before killing them swiftly, with extreme prejudice, and supplying a new regression test to plug up the hole they crawled through.[4] At that point you can feel like a hero (since, as Steve claims as representative of his hardcore software liberal stance, “Bugs are not a big deal!”) and get back to the work of writing interesting new code that will, hopefully, get you recognized and promoted.

Actually, as Steve requests in his final wrap-up, I don’t take issue with the fact that “Debugging sucks!” is applied to a “conservative” viewpoint. Insofar as the Testing Grouplet was advocating a technique of risk mitigation, rather than absolute risk aversion, it applies. But I still feel that the association in Steve’s post risks misrepresenting and trivializing the actual intent of the motto, and the actual impact that those of us that made that “banner claim” were striving for and actually achieved. I’m fairly sure that’s not how Steve meant it, but I can’t help but clearly challenge the “secondary” implications of such an inaccurate association made in the service of some other issue.

As an unrelated side note, his next “conservative” principle is this:

2. Programmers should be protected from errors. Many language features are inherently error-prone and dangerous, and should be disallowed for all the code we write. We can get by without these features, and our code will be that much safer.

I consider myself generally very liberal in this respect, except one: Implementation inheritance should be considered harmful, and I’m tickled pink that Go left inheritance out of the language completely.

Why the hate?

So why do some programmers, of various degrees of visibility, feel compelled to openly assert that automated testing is of dubious value on first principles, or interpret the enthusiasm of those who apply automated testing as ideological extremism? I am not a psychoanalyst, or a sociologist, and am well aware that, with what I’m about to say, I’m making a claim that is not supported by rigorous scientific validation. However, my experience in the tech industry and with life in general leads me to this hypothesis[5]: A large percentage of humans, programmers included, are often compelled by a need to sound authoritative on subjects of their choosing, even with no actual experience or only very limited experience, and without granting that different tools and methods work for different folks in different contexts. Their way is the One True Way, and woe be to the unbelievers. This is one of my least favorite qualities about many, many humans, but especially about programmers.[6]

It’s not just testing that evokes such strong reactions on both sides; programmers will get into it over the choice of programming language, the choice of code editor, the choice of operating system, and how one chooses to format the code itself. Though it seems most programmers claim to be atheists and/or scientists, perfectly rational beings swayed by logical argument and empirical evidence alone, many of them act mighty irrational and zealous with alarming frequency, and are easily threatened by ideas that do not fit into their current worldview, and by the people who espouse them. The fact that they work with code, with machines, is just a shield, a curtain which thinly conceals their own human nature and sense of insecurity.

That, and, well, yeah, some folks can be a little too pushy in advocating their methods and perspectives. Those of us who have “gotten” automated testing can be enthusiastic about our personal success with it, especially if we’ve had the death march project experience in contrast, and we run the risk of smothering others with our enthusiasm. That usually only helps to push away the very folks you’re hoping to persuade, and I’ve possibly crossed the line a few times myself.

Eventually, though, I learned that one succeeds in winning undecided and contrary-leaning hearts and minds not by pushing or pulling harder and harder, at least not directly, but by seeking out kindred spirits and demonstrating the value of one’s perspective and approach, refining concepts and tools along the way, until folks naturally gravitate towards your perspective.[7] Throughout my blog posts about the Testing Grouplet, et al., at Google, I’ve attempted to illustrate this principle in action: how a band of passionate advocates with zero authority to mandate company policy tried many ways to saturate the culture with our message over the course of years, and eventually succeeded. That happened because, ultimately, what worked for us eventually worked for a lot of other people, based on the merits of the approaches and tools, and their impact on productivity, not based on direct argument and compulsion.

In the case of passing, inaccurate associations like the one Steve Yegge made, well, I don’t think it was outright, deliberate hate and insecurity. I think he just reached for an association out of convenience, and didn’t think through the implications. We’re all guilty of that sometimes, too.

Why the love?

Why am I so sensitive about these things? Why have these testing-hostile remarks given me such a severe case of the Mondays?

I came to my own personal testing-Jesus not at Google, but while working on a small team at Northrop Grumman Mission Systems that had inherited a death march project. After barely making an acceptable release (without any team members murdering each other, I’m proud to say) we were given the time and freedom by the customer to just make any improvements we could in terms of speed and stability, and in that time I began my first experiments with unit testing. Just a few months thereafter, both the speed and reliability of the product were vastly improved, and I was able to add powerful new features with minimal defects that were quickly diagnosed and addressed. I attribute much of that success to the process of writing unit tests for myself as I wrote the code.

My point being, most of what I’ve written in my blog up until now has emphasized the importance of automated testing for an organization of Google’s size, for developing systems of Google’s scale and complexity. But, even on my relatively tiny project and team, long before coming to Google, I found immense value in practicing unit testing for myself. That experience is what convinced me that, given Google’s accelerating rate of growth and complexity, widespread adoption of unit testing, and automated testing in general, would be essential to maintaining its productivity and continued success. I found like minds in the Testing Grouplet, and we made widespread adoption of automated testing happen. Slowly, but it did most definitely happen.

So, yes, I care about this stuff because I’ve been invested in it for so long, it has worked very well for me over the years, I’ve seen it work very well for many others over the years, and I am immensely proud of the success of the Testing Grouplet and its allies in transforming Google engineering culture to one that values and practices thorough automated testing. From thoughtless generalizations to misquotes to bad examples to opportunistic alarmism to outright denial of the value of automated testing, I find all such statements as offensive as they are ill-informed and/or poorly thought-through. While I was still working at Google, I didn’t have the time or motivation to respond to such negative propaganda, but since I’m standing on a pile of soapboxes reaching to the moon these days, spreading the knowledge of the Testing Grouplet’s methods and achievements beyond Google’s borders, I thought it apt to apply my own voice in contradicting such misinformation.

Perhaps surprisingly, though, I must confess I’m not a pure Test-Driven Development practitioner or advocate, nor am I much of an Agile software development practitioner. I’ve very rarely written tests before the actual code, very rarely pair-programmed, participated in relatively few stand-up meetings, etc. I think it’s fantastic that folks do these things and get immense value from them, and I am willing to play along with anyone who operates in such an environment, but those were never exactly my things. For me, the biggest bang-for-the-buck always seemed to come from designing for testability, writing the tests immediately after writing the code, ensuring a battery of appropriately-scoped tests were in place over time, and performing continuous integration and testing. That, and code reviews. Everything else is gravy to me, really.

And, yes, sometimes I’ve overcomplicated things in pursuit of testability. Thankfully, other talented engineers who performed thoughtful code reviews were there to pull me back from the edge whenever I went too far, though those times were rare. More often than not, particularly while serving as a Test Mercenary, and also during my time working on websearch, I was the one influencing my peers as to how to better structure their code and tests.

Conclusion

The evidence, at least anecdotal, is overwhelmingly in favor of the claim that, in many contexts, automated developer testing provides immense productivity benefits, both in terms of accelerated development and averted catastrophes, both of which are, unfortunately, hard to measure, though easy to experience. One can get it wrong, by doing it half-heartedly, by not having sufficient tools or training to do it right, or by going overboard and obsessing over code coverage for its own sake. That doesn’t mean automated testing can’t be done right, that how you’re doing it now can’t be done better, or that there’s no value in the practice at all.

And, of course, unit testing or automated testing in general is not a replacement for thinking through the issues a piece of software is trying to address, nor for any other development productivity or quality-assurance practice; it is another spice to add to the mix. The right mix of automated tests and sufficient code coverage at all levels—unit, integration, system, etc.—can eventually make better use of your manual testers’ time and talents, in addition to your own as a developer.

One is perfectly justified in claiming that, in one’s own experience, one has not received sufficient benefit from the practice of unit testing, or automated testing in general, to justify adopting it as a regular practice oneself. One is justified in claiming that, on one’s own project, other factors have taken priority and there is no time to test, especially if one is shipping and/or maintaining a functioning product with enough support from other tools and practices that automated testing is not missed. It is not justified to claim that, based only on one’s personal experience and opinions, automated testing is a waste of time with dubious benefit and should be generally discouraged. Clearly many people feel it helps their productivity and quality of life—including most of Google engineering these days—or else they wouldn’t waste their time doing it.

Allow me to draw a parallel with another domain: I’m a Fender Stratocaster man myself, and have my reasons for it8, and fully advocate that anyone who thinks a Strat might raise their level of play or enjoyment should give one a try. But just holding a Strat won’t turn you into Hendrix—oh, how I wish it did, sometimes!—and I still love the sound of a Gibson Les Paul in the hands of a master—I happen to own one, though I play it infrequently—and would find it absurd to label Les Paul players as misguided, cultish, and considered harmful. Programming is art as well as engineering; do your thing your way, and feel free to try to persuade others to try your way, but don’t shit on what works for others because you have your reasons for doing things differently.

Coming back around to Jeff Atwood’s point from almost the very beginning of this post about code quality not mattering that much: Well, that may always be true from an end-user’s perspective, or from the perspective of small, tight-knit teams working on very few products, but I challenge him to make that assertion about a programming environment like Google’s.9 When you have engineers who have never met, perhaps never will, don’t speak the same native languages, have completely different cultural backgrounds, operate in different time zones, and yet integrate thousands of code changes every day to support a broad portfolio of powerful, complex, and extremely popular products, code quality and automated testing begin to matter a whole lot. And that’s just in Mountain View.

Footnotes

1Jamie wasn’t a fan of Joel’s article, either, albeit for different reasons; see his response.

2As opposed to the Melting Pot effect, which was ostensibly the dream of all this technology providing access to greater information and multiple viewpoints that would prove otherwise inaccessible. Technology can be a powerful force for positive changes in life quality for huge portions of humanity—but it does not guarantee Utopia.

3The Google Web Server team, the poster child for productivity unleashed by the diligent application of automated testing, whose tech lead was the Testing Grouplet founder, and on which Test Certified was modeled, never formally achieved Test Certified Level Two, and certainly never approached 70% small test coverage. That particular figure never made sense for GWS.

4I actually plan to describe a couple of such bugs I personally squashed in Google websearch systems in my next “whaling” post. No confidential stuff, but I’ll be plenty descriptive.

5Note I’m careful not to say “conclusion”.

6And, yes, I realize that I proved myself a bit of a hypocrite in this regard in the paragraph immediately preceding this one.

7As I mentioned in a previous post, Geoffrey Moore’s Crossing the Chasm explains this phenomenon much more thoroughly and objectively.

8Of course, I don’t play a Strat because Jimi played one, but because I love the sound and the feel of playing one—but I try to evoke Jimi to the extent I’m able very often when I do play. If you watched that video, you also have to watch Jimi’s Dick Cavett interview discussing the performance. And this clip of Little Richard’s recollection of Jimi. Just ’cause.

9Jeff appears willing to admit context and scale may influence one’s choice of appropriate software development practices in general; see his September 2007 post, Steve McConnell in the Doghouse. He also sounded very pro-unit testing in April 2005 (Good Test/Bad Test) and July 2006 (I Pity The Fool Who Doesn’t Write Unit Tests; the title is actually misleading, in that he stands in support of unit testing but opposes the denigration of those who have yet to adopt it). In fact, he said lots of great things that sound very similar to what I’m repeating over and over again here in my blog—and which even mirror my personal attitude towards testing that I describe in this very post.