Mike Bland

Testing on the Toilet

The Testing Grouplet's weekly publication for spreading testing news and views throughout Google, in the most opportune of places

- New York
Tags: Google, Test Certified, Test Mercenaries, Testing Grouplet, TotT, grouplets

Testing on the Toilet (aka TotT) is the most visible achievement of the Testing Grouplet. Starting as an offhand joke during a meeting, TotT has become a genuine Google institution since 2006, and is one of the most effective ways to spread ideas, generate discussion, and drive new internal tool adoption within the company. As the Washington Post noted in the article Building a ‘Googley’ Workforce, “To understand the corporate culture at Google Inc., take a look at the toilets.”

Testing on the Toilet Episode 49, "Test Certified: Lousy Name, Great Results"
A TotT episode on Test Certified (which I happened to write!) that also features the Test Certified and Test Mercenaries logos designed by Mamie Rheingold. The Testing Grouplet logo by Johannes Henkel (with a small contribution from yours truly), the episode number, the advertisements, and the bold links at the bottom are all part of the standard TotT template, originally drafted by Flavio Lerda.

This post tries to cram a whole bunch of details into a single context. To help you digest it, I’ve broken it into smaller bites and sprinkled in images of TotT episodes, taken by visitors to Google’s engineering offices, that I’ve found around the web.

A much abridged version of TotT history exists on the Google Testing Blog. It hits many of the high points; think of that post as the toilet itself, and what follows in this post as the plumbing—or maybe the contents.

She Came In Through the Bathroom Window

TotT was born in a Testing Grouplet brainstorming session in early 2006. Those of us in the grouplet had long been wrestling with the question of how to transmit to the rest of Google engineering the benefits of unit testing, and the knowledge needed to do it well, in order to promote widespread adoption of the practice. Experience was absolutely the most convincing form of persuasion, but at that point we hadn’t the means to scale direct experience—Test Certified and new and improved build and test tools would eventually help with that; the Test Mercenaries was another attempt that didn’t scale as we hoped, but provided a different value in retrospect.

Also at the time, those of us in Mountain View had made a few connections with like-minded folk in the European offices, and we were struggling to discover ways we could all be involved and productive together. Some of our first trans-Atlantic videoconferences had been disheartening failures, despite the best intentions of all involved, and our good rapport offline. In particular, Ana Ulin and I met while taking an Industrial Logic Design Patterns workshop in Mountain View and became good friends before she returned to Zürich, at which point we started having long instant message conversations on a near-daily basis about all things testing-related and not. (And during the course of one of our chats, she taught me the value of Safe Search, which I think I’ll relate in a future post.)

At some point during the brainstorming session, Bharat Mediratta, who led the Testing Grouplet at the time with Nick Lesiecki, threw out the idea of maintaining a lending library in the bathroom stalls, since there appeared to be a widespread interest in unit testing books at the time. (Really, I think people just liked collecting free books, not necessarily reading them, and we’d already given out hundreds without yet realizing this.) At this prompt, Antoine Picard noted that at his previous job at Adobe, some jokesters got the itch to post a series of Poetry in the Pissoir (or some such) flyers in the restrooms. Bharat then made a comment about how we should just post flyers in the bathrooms, since we know everyone’s going to have the time to read. We all laughed a lot, not taking the joke too seriously, but put it in the official meeting notes anyway, because we were fun like that. (I believe JB Steadman was taking notes that day.)

A few days later, after having read the meeting notes, Joe Walnes in the London office decided he’d take the initiative to produce and post the first TotT. Antoine Picard sent him a draft of an article that he’d written on better stubbing in Python, and Joe put up the first flyers in London. Though I don’t recall the exact order of events, eventually the Testing Grouplet community produced a few more articles, and Ana began piping up on the Testing Grouplet mailing list, demonstrating enough interest and high-level thinking on the subjects of content solicitation, editorial review, and volunteer distribution throughout all Google engineering offices that she became the first official TotT Coordinator, setting the standard for the operational model that exists to this day (as far as I know).

Note this: While folks from the headquarters in Mountain View were involved, the folks who took the ball and ran with it, right into every bathroom in every Google engineering office in the world, were located in offices outside of Mountain View. Without explicitly trying, we stumbled upon one of the key mechanisms that allowed the Testing Grouplet to operate as a truly functional multi-site volunteer effort: Create a project composed of discrete tasks that can be pipelined and executed in parallel. Episode authors can be from anywhere, multiple episodes can be in the pipe at any time, anyone on the mailing list can help edit whenever they can, and two volunteers (presumably, unless the bathrooms are unisex, like they were in Trondheim) for each floor of each building of each office can post episodes. The only lock that the whole process depends on is the coordinator’s explicit approval of the episode to be published each week. Test Certified and Fixit organization eventually followed the same distributed, discrete task-driven model. It seems trivially obvious now, and certainly that’s how any successful multi-site business (or multithreaded program running on multi-core processors) operates. Funny how this blatant truth remained hidden from us as a volunteer organization until good intentions lost focus and concrete action took shape.

Mid-2006 pre-standard template Testing on the Toilet, "JavaScript: Simulating Time in jsUnit Tests"
Mid-2006 pre-standard template TotT

May I Introduce to You the Act You’ve Known for All These Years

After Ana, David Plass in NYC took the coordinator role for a year, followed by Emily Johnston in Mountain View, and it is now coordinated by Jim McMaster in Boulder. Also worth noting is that throughout its history, Chris Lopez in Mountain View (he of Chris/Jay Continuous Build fame, which I’ll describe in the future) has provided the lion’s share of editorial feedback and assistance; in fact, most episodes won’t go out the door unless the coordinator knows Chris has taken a pass first.

Testing on the Toilet Episode 42, "Contain Your Environment"

Hello Goodbye

Back to the story: After the first few episodes started coming out, we started hearing a bit of a backlash. Some folks were outraged that we had violated their sacred bathroom space, crossing some arbitrary line of decency. However, the biggest backlash was that we weren’t producing enough new content quickly enough; people would complain that they’d been reading the same episode for two-and-a-half weeks, and they were hungry for something new.

Outside the company, as some reports of TotT began to leak out, many found the TotT phenomenon to be interesting and amusing and revolutionary; others put on their tin-foil hats and assumed it was a mind-control exercise by the top brass:

“Somehow, though, this particular example, where one cannot escape one’s work even in the bathroom, strikes me as bordering on obsessive / compulsive, and seems almost Orwellian.”

“Despite the veneer of amiability about this project, I find it faintly disturbing. Why? I think it’s the attempt to work the job and group mindset into every part of an employee’s day and life…. I think what really bothered me (in this context) was the group-oriented friendliness. It took me a while to figure out why, then I realised that it was faintly reminiscent of a cult.”

There was one Google-internal expression of dissatisfaction that actually became a permanent part of the publication: One day, someone had gone into all of the men’s rooms in Building 43 and taped a bunch of unit test tool ads to the bottom of all the posted episodes. People started asking on the Testing Grouplet mailing list (or maybe we had a TotT-specific list by then?) about where these came from and what they meant. Being the grumpy sort at this point, I guessed that someone was protesting against our flyers by trying to equate them with ad spam, and I said that we should just put the damn ads on the flyers ourselves to save them the trouble.

An hour or two later, maybe quicker, Flavio Lerda produced an OpenOffice template that has been the standard format for every episode ever since. It contains the Testing Grouplet logo (which doesn’t actually say “Testing Grouplet”; more on the logo in a future post) in the upper left corner, the episode number in the upper right corner, and two ads at the bottom resembling Google search result ads that point to further material, internal or external, pertinent to the topic at hand.

Testing on the Toilet Episode 88, "Testing against local MySQL"

Here, There and Everywhere

In the beginning, it was mainly members of the Testing Grouplet who wrote episodes. I wrote five or six back in the day, trying to keep up with Antoine Picard, but he kept going even after I stopped. We wrote a great deal about dependency injection, testing techniques, the benefits of test coverage measurements, what makes for a good test vs. a bad one, and the available testing frameworks, particularly Zhanyong Wan’s in-house gUnit and gMock frameworks for C++ (now open-sourced as Google Test and Google Mock). Eventually, folks from across the company started volunteering to write episodes, to share their own nifty techniques, or practical insights, or tools they’d developed which might be of use to the broader engineering community. This helped ensure a much more robust flow of ideas into the pipeline, which was important to ensuring that those of us in the Testing Grouplet weren’t just operating inside an echo chamber.

Getting folks to edit episodes never seemed to be much of a problem; the TotT Coordinator was there to ensure each submission made it into the queue and received the attention it needed to become publication-ready. Physical distribution of episodes posed another challenge, but one which eventually did work out. At first, a few of us would run around the offices in Mountain View after-hours when the place wasn’t so crowded to pull down old episodes and tape—yes, tape—new ones to the walls and stall doors. In Zürich, Ana filed a ticket for permanent plastic flyer-holders to be installed and just handed a stack of fresh episodes to the custodial staff to post each week. Eventually all the other offices also installed flyer-holders, which made the job quicker and easier, and in some places, folks would just post the new episode in front of the previous one, building up a small library of a few weeks’ episodes at a time.

Ana established the original global network of volunteers to regularly post episodes in every office. Usually the first to complain in each office/building/floor that their bathroom wasn’t receiving updates or was, even worse, TotT-free, would be asked to step up and become the volunteer for their location. This recruiting strategy worked quite well, as word spread through the company and folks didn’t want to feel left out, particularly in offices outside of Mountain View.

Mountain View was already relatively large by 2006-2007 and growing very rapidly. While we were building up the regular volunteer network there, Michelle Levesque visited the end of the weekly Noogler introductory lecture on unit testing (produced by the Testing Grouplet in conjunction with EngEDU, the internal training organization) to conscript brand new employees into the Noogler Army. The Noogler Army’s mission was to have each member return to his/her building and post the current week’s episode; originally we offered books on unit testing as a reward for this service, but once the August 2006 Testing Fixit had passed, we often gave the conscripts T-shirts. (This kept going for a while; not sure when it stopped. Unfortunately, we never ran out of T-shirts.)

Testing on the Toilet Episode 124, "When coverage is not enough"

Don’t Let Me Down

After the first couple of years, the pipeline would at times get dangerously low on content; such is the nature of a 20%-volunteer effort, when people give what they can, when they can. Fortunately, when the pipeline runs a little dry, momentum can usually be maintained by publishing one of two alternate TotT brands: A Blast from TotT Past and Testing on the Toilet Presents. A Blast from TotT Past came about when we realized that not only was the pipeline running low, but a very large number of engineers had joined the company who hadn’t been exposed to the earlier episodes. By republishing past episodes, Nooglers could encounter ideas and tools they possibly hadn’t been exposed to before, and the remaining Googlers might be inspired by the same information in a new way, given the gulf of experience between the time a given episode first appeared and the time it was republished. Since these repeat episodes are only published occasionally, everybody stays happy.

A (not quite) Blast from TotT Past Episode 140: "The Times, They Are A-Changin'"

Give Peace a Chance

Testing on the Toilet Presents was a very positive response to the oldest form of flattery: imitation. Everybody knows that when a good idea takes off, people looking to replicate its success tend to reuse the same form factor as its original implementation. The Documentation Grouplet produced Waiting on the Water episodes posted in the microkitchens. The Hiring Grouplet posted Hiring on the Table episodes in the cafeterias. (I kept trying to get them to call it Hiring Under the Table. They never went for it.) And many other groups who wanted exposure were more than willing to write a TotT episode about their particular topic or area of interest—the only catch being that they had to work testing into the content somehow.

But what posed a genuine problem for TotT was the introduction of new publications such as Production on the Potty, which ate into our long-cultivated publication spaces. Spirited discussion ensued. Some argued that we needed to be somewhat militant in “defending our space”, actively tearing down competing publications wherever they were encountered. There was a misguided attempt to write a cease-and-desist letter, whose delivery basically led to the entire European continent laughing at us (the PotP folks were based in Zürich, or maybe Dublin, as I recall). Meanwhile, I kept repeating that I found it ridiculous that we would get into a turf war with our colleagues over a bunch of bathroom walls. Though they were in some sense infringing upon space we invested so much time and effort to legitimize, they were also paying tribute to our success, and there had to be a more imaginative response than running around tearing things down.

TotT Coordinator David Plass produced the compromise: XotT (where X is an algebraic variable representing whatever the topic-of-the-week would be). Eventually this name would be replaced by Testing on the Toilet Presents. In either case, the idea was that rather than fight these groups who wanted a cut of the action, why not work with them by providing the chance to publish as part of TotT proper, with one such slot available every few weeks? This ameliorates the empty-pipeline problem to some extent for TotT, gives those other groups rigorous editorial review and a much broader and scalable distribution mechanism, and gives everybody the chance to find out about other interesting engineering developments outside of the testing world in a polished and accepted format. Everybody wins.

Testing on the Toilet Presents Episode 83, "Findability in the Facilities: Better Search: Found"

Do You Want to Know a Secret?

Speaking of editorial review, an amusing note on confidentiality: Each episode of TotT after a point got a very thorough editorial review, and one part of that review involved making sure that confidential information wasn’t getting leaked. We dotted our i’s and crossed our t’s, you betcha. However, many of our internal tools, while not of particular interest to the outside world, were not what anyone considered confidential. And certainly no one, not even all the way up to Eric, Larry, or Sergey, could miss the fact that we were publishing all kinds of information on internal tools all the time.

Still, there’d be the occasional well-meaning Noogler raising the alarm bells on something he/she just read in the local stall. One particularly amusing incident was when I received an email forwarded from my good friend and Noogler classmate Damian Menscher, a member of the network security team. It was a long thread starting with one of said Nooglers’ alarm bells, bubbling up through several layers, and ultimately reaching the head of physical security, who questioned whether the head of Engineering Productivity, Patrick Copeland, was aware of such leakage. I replied-all on the thread to reassure everyone that, yes, Patrick and everyone else in Eng Prod and the Testing Grouplet were more than aware that, yes, we had posted an episode discussing SrcFS in a publicly-accessible place, and that no one considered it a problem. Rob Peterson, the Build Tools manager at the time, who I’d cc’d, also followed up and reassured everyone that all that had been written was that Google had found a way to scale its Perforce usage, nothing more. Curiously, there were no replies after that. And TotT still exists.

The Episode 74: SrcFS in 27 Easy Steps image I found does not permit downloading, so I can’t reproduce it here. You’ll have to follow the link to see what the fuss was about.

We Are All Together

OK. So what exactly would you say Testing on the Toilet did here? For starters, in addition to spreading awareness of the various testing techniques, tools, and terms, it really caused those of us in the Testing Grouplet to become more precise about our terminology and messaging, e.g. clarifying the difference between fakes, mocks, and stubs, and supplanting the misused (within Google) labels “unit”, “integration”, and “system” for automated tests with “small”, “medium”, and “large”. There was much feedback and (ultimately) healthy debate about each choice of words, and readers often sent in suggestions for technical improvements or, in some cases, outright corrections. Many of the episodes and conversations were framed within the context of the Test Certified program, which provided a clear, direct path for engineers to take the new knowledge and tools, apply them to their own project’s code, and see a concrete improvement.
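For readers who haven't run into the fakes, mocks, and stubs distinction before, here is a minimal sketch in Python of how the three terms are commonly used. It is my own illustration, not a reproduction of any TotT episode, and `UserStore`, `greeting`, and `rename` are hypothetical names invented for the example.

```python
from unittest import mock


class UserStore:
    """Hypothetical production dependency; a real one would hit a database."""

    def get_name(self, user_id):
        raise NotImplementedError("talks to a real database")

    def save_name(self, user_id, name):
        raise NotImplementedError("talks to a real database")


def greeting(store, user_id):
    """Code under test: builds a greeting from a stored name."""
    return "Hello, " + store.get_name(user_id) + "!"


def rename(store, user_id, new_name):
    """Code under test: stores a whitespace-trimmed name."""
    store.save_name(user_id, new_name.strip())


class StubUserStore(UserStore):
    """Stub: returns a canned answer and has no logic of its own."""

    def get_name(self, user_id):
        return "Ana"


class FakeUserStore(UserStore):
    """Fake: a working but simplified implementation (an in-memory dict)."""

    def __init__(self):
        self._names = {}

    def get_name(self, user_id):
        return self._names[user_id]

    def save_name(self, user_id, name):
        self._names[user_id] = name


def check_with_stub():
    # The stub only feeds the code under test a fixed input.
    return greeting(StubUserStore(), 1)


def check_with_fake():
    # The fake really stores the name, so state can be read back afterwards.
    store = FakeUserStore()
    rename(store, 7, "  Joe ")
    return greeting(store, 7)


def check_with_mock():
    # Mock: records calls so the test can verify the interaction itself.
    store = mock.create_autospec(UserStore, instance=True)
    rename(store, 7, "  Joe ")
    store.save_name.assert_called_once_with(7, "Joe")
    return store.save_name.call_count
```

Roughly speaking, the stub and fake versions check what the code returns or stores, while the mock version checks how the code talks to its dependency. Different teams draw these lines a little differently, which is part of why settling on the words took some debate.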

And of course, there was no tool quite like TotT to publicize an engineering-wide Fixit. With one episode, you could immediately get nearly every engineer in the company to know a Fixit was coming, why it was coming, what was in it for everybody, and where to go for more info. For Fixits that introduced or focused on compatibility with new tools, such as the Revolution, Forgeability, and Test Automation Platform Fixits, there was no better way to get people excited about the new tools and send them back to their desks or their teams to start preparing for the event.

All of this combined meant that every Google engineer and team could stay in-the-loop regarding current testing and tool developments and share a common vocabulary, reducing friction to adopting regular testing practices that much further. Much like code readability standards that enable any engineer to make sense of any Google code without being thrown off by formatting differences, having TotT and Test Certified provide the framework and forum for testing discussions meant that engineers everywhere could share, contribute to, and benefit from ideas springing from conversations happening all around the world.

Testing on the Toilet Episode 23, "Understanding Your Coverage Data"

A Hard Day’s Night

How did we know when TotT had arrived, that it had changed the culture? For starters, when co-founder Sergey Brin publicly suggested to an engineer looking for culture-change guidance that his group could try doing something like TotT, that was a big moment. Various joke episodes also sprang up; to this day, as far as I know, no one knows who wrote them, but they were damn funny. As mentioned before, we had many people outside the grouplet coming to us to publish their episodes, and we negotiated a lasting peace with other folks who were motivated enough to try distributing their own publications over the top of ours.

At the top of this post, I referred to the Washington Post article that suggested that TotT was a key to understanding Google culture. One blogger even postulated that the “this site may harm your computer” bug was a sign that TotT itself had failed.

However, the true indication of success, aside from the fact that TotT has published over 200 episodes and counting, is that automated testing is something that has become an expected part of normal development and release practice for all Google engineering teams. As I’ve tried to make clear through my series of blog posts, TotT alone wasn’t responsible for this sweeping cultural change, but it was the fulcrum that all the other levers—Test Certified, Fixits, tool updates—depended on to do their work most efficiently. It was, and still is, a true community effort that expanded, adapted, and evolved with the times.

Testing on the Toilet, "Bigtable Testing Made Easy"

Tomorrow Never Knows

Despite the profound cultural impact TotT has had, many folks don’t associate it, Test Certified, many of the Testing Fixits, or other pretty visible artifacts with the Testing Grouplet. Old-timers certainly remember that the Testing Grouplet folks were the original troublemakers, but so many new engineers have joined the company that all of these institutions seem to them to be separate, if symbiotic, phenomena. Ironically, that’s probably the most significant indication of the Testing Grouplet’s lasting impact.

As I mentioned in an earlier post, about a year and a half ago Ana, now in Mountain View, told me an amusing anecdote about a member of her current team, a TotT distribution volunteer, who had explained to Ana how TotT distribution works when Ana volunteered to fill in for this teammate one week. I laughed when she told me that.