Testing Grouplet

The grassroots volunteer team that made automated developer testing a core practice of Google engineering culture, and had fun doing it

27 Sep 2011 - New York
Tags: Fixits, Google, Test Certified, Test Mercenaries, Testing Grouplet, TotT, grouplets

The mission of the Testing Grouplet was to convince Google engineers to write testable code and to write more automated tests (unit tests, regression tests, and so on) as part of their normal development process. I co-led the Testing Grouplet from 2006 to 2007 with Neal Norwitz and Michelle Levesque, with massive support from Mamie Rheingold. This post covers the biggest components of the Testing Grouplet’s history as I see it; future posts will dive deeper into specific aspects, and perhaps flesh out more of the Testing Grouplet’s story.

I joined the Testing Grouplet not long after I joined Google in August 2005; it had only started that April or so, led by Bharat Mediratta and Nick Lesiecki. I’d been hired as a “Software Verification Engineer” into the Testing Technology team. My job description at the time mandated that I had to show Google engineers how to write better, more testable code—a daunting task for a nobody upstart stumbling through the door from a defense contractor in Virginia. (I got over this fear somewhat when I tried again later as a Test Mercenary.) When I learned about the Testing Grouplet, it sounded like the perfect place to bond with a broader community beyond that of my own team. Programmers, and Google programmers in particular, are notoriously difficult to persuade to change their tried-and-true habits, or to organize in any efficient, meaningful way—top-down mandates do not work—and here was a bunch of 20%-ers crazy enough to think they could change the way development was done across the company, without explicit executive support.

We had no authority, very little budget, but loads of attitude, latitude and creativity. We composed and presented Noogler (new Googler) lectures, to introduce the available tools and indoctrinate new hires early, and eventually more engineers had gone through the lectures than hadn’t. We had tech talks from both internal and external speakers. We had Beer & Ice Cream Socials—fun for a while, but not scalable. But what really won people over was the combination of Testing on the Toilet, Test Certified, tool support from Build Tools and Testing Technology, and testing and tools-related Fixits.

TotT is something many folks outside the company know about these days; starting as an offhand joke during a meeting, TotT has become a genuine Google institution since 2006, and is one of the most effective ways to spread ideas, generate discussion, and drive new internal tool adoption within the company. Joe Walnes picked up on the “joke” from the meeting notes and posted the first episode in London with input from Antoine Picard in Mountain View. Ana Ulin in Zurich became the first de facto TotT “coordinator”, running the show for a year or so and setting the standard for the current operational model of soliciting volunteers to provide content and to distribute each episode throughout Google engineering offices. (About a year and a half ago, Ana, now in Mountain View, told me an amusing anecdote: when she volunteered to fill in one week, a member of her current team, a TotT distribution volunteer, explained to her how TotT distribution works.)

Let that sink in for a minute: A joke in a meeting in Mountain View became an internationally-coordinated publishing platform throughout Google engineering—and was operated from outside Google headquarters in Mountain View from the start. That’s the magic of Google Intergroups, and that’s only part of the Grouplet story (though arguably the most visible).

Test Certified was a kind of twelve-step program (without twelve steps, exactly) for getting teams to improve their developer testing practices and test coverage. The Test Mercenaries were a team of full-time engineers usually deployed in pairs to other Google engineering teams to coach them towards improving their testing policies and practices, using Test Certified as a concrete roadmap. Future blog posts will discuss TC and the Test Mercs at length. For now, suffice it to say that TC did its job: Once the Testing Grouplet honed the Test Certified message and convinced the Test Mercenaries and Test Engineering to throw their collective weight behind it, it drove so much discussion, process change and tool development that its policies and practices became essentially standard procedure for most Google engineering teams.

I organized two company-wide Testing Fixits: One on August 3, 2006, and another on March 8, 2007. We did everything we could to get the word out about fixing broken tests and writing new ones, making the event exciting and fun, giving out all kinds of schwag and T-shirts. There was no specific agenda other than that; as such, we raised awareness that developer testing was something that more and more folks were taking seriously, even if not everyone was doing it yet.

Warning: Never, ever, EVER use T-shirts as Fixit schwag. I’ll explain later.

After a number of efforts, TotT reception and feedback, and a few Testing Fixits, the Testing Grouplet realized that we’d solved the problem of getting Google engineers’ attention; but we were hearing loud and clear that most of them thought they “didn’t have time to test”. To paraphrase Saul Alinsky from Rules For Radicals: If people believe they have the power to do the right thing, to change their situation for the better, they’ll do it. Otherwise they won’t. In that light, we interpreted this “I don’t have time to test” perception as a problem with the available tools, and we started working very closely with the Build Tools and Testing Technology teams, who in turn developed the most amazing build, integration and test system on the planet to make doing the right thing faster and easier.

Subsequent Fixits were less about writing new tests or fixing broken ones, but were extremely effective at driving awareness and adoption of the new build and testing tools: the Revolution on January 31, 2008 (organized by me); the Forgeability Fixit in November 2009 (organized by Rachel Potvin); and the TAP Fixit on March 6, 2010 (organized by me and a host of regional organizers I’ll name later). Future posts will describe Fixits in general and these Fixits in particular, and the years-long “fixit arc” that emerged, or rather a “fixit circle”, since by the time of my departure from Google there was a “Flaky Test Fixit” in the works: an honest-to-goodness, old-school Testing Fixit.

As history played out, it became apparent that the TotT + Test Certified + improved tools strategy worked: It took a few years, but we did it. In 2005, relatively few Googlers took developer testing seriously. By 2007, they were complaining that they just “didn’t have time to test”. By 2009, everyone had a local continuous integration and test system (the venerable Chris/Jay Build System) running on top of the new build tools. By 2011, with everyone using the centralized, ubiquitous Testing Automation Platform, whenever code is submitted that breaks anybody’s build, the author tends to hear about it within minutes, if not seconds, from TAP itself, and depending on the depth of the change within the infrastructure, possibly from dozens of vigilant build cops, both human and automated.

While this dynamic strategy of TotT, Test Certified, Fixits, and collaboration with Build Tools, Testing Tech, and Test Engineering (really the entire Engineering Productivity focus area) may have played out nicely, the important thing to remember is that from 2005 until 2007, we had no idea what the hell we were doing. We knew we were on the right path, trying ideas to find what stuck, and when the time was right, the long-term strategy revealed itself. But we didn’t start out with a grand strategy. We were entirely creatures of faith, faith born of personal experience that had convinced us that imposing our particular idea of software development was the right thing to do if Google was to continue to hire more and more engineers to work on so many products across time zones and cultural boundaries, all working from a single (more or less) source code repository shared across the company.

Fortunately for us, Google wasn’t set up to allow anyone to impose anything on anybody. We had to persuade our fellow engineers, and to provide them with the necessary support to do the right thing, until they did it not because they were told, but because they perceived the tangible value in doing it for themselves. We had to commit for the long haul, in deed more than in word.

Granted, we may have had some outside help. As Antoine Picard pointed out to me, the tide of industry was turning more and more towards testing as an essential development practice, and Google started hiring more and more folks already inclined towards the testing philosophy. However, while that movement may have helped us reach the tipping point, it’s hard to say for certain that it alone would’ve caused the same change to happen. Had the Testing Grouplet not existed and done what it did, any new hires arguably could’ve succumbed to the prevailing attitude that tests were a luxury at best and a waste of time at worst, rather than an essential development activity. It’s hard to imagine Google would have its current body of engineering knowledge with regard to testable design and coding practices, or the caliber of build and testing tools it has today, had the Testing Grouplet not had the gall to tell Google engineering it could do better, even though we were all figuring out how as we went along.