The 2007 Testing Fixit

The second Fixit I organized to promote widespread adoption of automated developer testing throughout Google development

23 Oct 2012 - Boston and NYC
Tags: Fixit Grouplet, Fixits, Google, Testing Grouplet, grouplets, schwag

The second Testing Fixit I organized took place on March 8, 2007. It wasn’t as easy as the first, the August 2006 Testing Fixit.¹ The first time was special, memorable, magical—I felt so connected, and loved. The second time was hard, exhausting work, and left me feeling lonely and borderline regretful. Still, it was a growing experience, preparing me for future situations, with their requisite complications, absent my immediate awareness.

The March 2007 Testing Fixit was not remarkable for its direct impact on the visibility and adoption of automated developer testing throughout Google development, but for several key insights developed as a result of the Fixit which led to advances in the organization of future Fixits, and which were eventually canonized and documented by the Fixit Grouplet, which I led from September 2007 until November 2008. In this way, the March 2007 Testing Fixit was a key component of the Testing/Tools Fixit Arc stretching from August 2006 to March 2010.

The Testing Fixit 2007 T-shirt and web application. Lifted this pic from the slides for Mark Striebeck’s 2011 Lean IT conference presentation. Yep, that’s me, dutifully walking away from a camera-toting MarkS in Building 43.

Let’s Have a Fixit!
Fuzzy is for Peaches, Teddy Bears, and Walrus Dolls
Task Menu
Scoring
Gladwell Mach I: Wire and Duct Tape
T-shirts
Scheduling and Fixit Fatigue
Fixit Day
Aftershock
Roles
Epilogue
Footnotes

Let’s Have a Fixit!

The August 2006 Testing Fixit turned my fortunes around at Google. I was promoted to become one of the Testing Grouplet co-leads, along with Neal Norwitz and Michelle Levesque, shortly afterwards, largely on the strength of my performance organizing the Fixit. Mamie Rheingold became a very close partner-in-crime, not just of the Testing Grouplet in general but of me personally, thanks to our Fixit experience and her increasing role in developing Testing Grouplet strategy and helping all of us execute on it. That November, I had already planned to eventually leave Build Tools to join the Test Mercenaries the following spring, and I went on my so-called Tools and Testing World Tour. On that trip, I gave Tech Talks on both Build Tools developments and Test Certified in several offices, leading to my introduction to David Plass in NYC, who would become another close partner-in-crime and the second Testing on the Toilet coordinator after Ana Ulin; and to the highest “official” recognition I received for my testing and Grouplet efforts, received some years later, for promoting the adoption of the Chris/Jay continuous build system via my European Test Certified talk, supposedly:

My very own Demy Award for promoting the Chris/Jay Build.
Trust me, it is mine, even though my name’s not on it. My name was on the sticky note stuck to the box in which it was delivered to my desk.

Still somewhat flush with the success of the previous Testing Fixit and all that followed, I proclaimed, in early 2007, it was time for another Fixit! We hadn’t had one in nearly six months, and considering how quickly and easily everything came together the last time, I imagined that at the very utterance of the word “Fixit” magic things would happen and we’d all leap atop our mighty steeds and ride out together like the grand shining cavalry we already once proved ourselves to be.

As luck would have it, running another Testing Fixit was not as easy as it sounded, at least not from my point of view. Many of the volunteers from other offices who helped with the previous Testing Fixit agreed to participate in this new Fixit, and one of the VPs agreed to send out a supportive announcement email (that we’d prepare, of course). Testing on the Toilet would make sure word of the Fixit got out far and wide. For the more intensive activities, however, it seemed that most of the Testing Grouplet folks in Mountain View operated on the faith that “Mike’ll take care of it.” I unwittingly aided this perception by doing so much work to present to the Grouplet each week that, rather than seeing all my energy and productive effort as evidence of progress and inspiration to contribute, they saw it as leaving little else for them to do. I got things done, but not as well as I’d’ve liked, and it damn near killed me—and Mamie, the only one crazy enough to get as deep into this new Fixit as me.

Mamie Rheingold designed this flyer, keeping with the rock concert/green theme by using this sweeet PRS with a green flame maple top. She totally found that on her own; I had no say in it. The blotchy part at the bottom is a printing artifact, as this is a scan of a copy I held onto.

Fuzzy is for Peaches, Teddy Bears, and Walrus Dolls

On top of that, admittedly, the goal was a little vague: “More and better developer testing.” Not so concrete; turns out that matters. For the first Testing Fixit, we were clear that we wanted people to fix tests, write new tests for uncovered code, and improve their code coverage numbers. For this one…more of the same? Yeah, sure, OK. Somehow “more of the same” wasn’t as compelling as said “same” was the first time around. It’s not that more fixed tests, more new tests, etc. wouldn’t be a fine goal for another Testing Fixit by itself, but not so soon after such a similar one. Indeed, I remember Nick Lesiecki drilling me in his gentle-but-firm manner about the fact that the overall goal was lacking,² and I was certainly open to ideas, but neither I nor anyone else managed to come through with a superior concept. Hence, “more and better of the same”.

That would be the last time we ever had that particular problem. The Revolution, the Test Certified Challenge, the Forgeability Fixit, and the TAP Fixit were powerful in large part due to their specificity, and the fact that the specific goals were directly tied to significant, tangible, immediate benefits for the participants, in addition to contributing to the overall effort to change Google development culture—ultimately and concretely expressed as the Man on the Moon mission the Testing Grouplet developed in mid-2007: Ensure every team at Google reaches Test Certified Level Three by the end of 2009. These Fixits immediately presented participating developers and teams with better tools, faster builds, the adoption of specific testing goals and policies, and after the TAP Fixit, (mostly) pain-free, lightning-fast feedback on the health of each individual code change across all affected projects in the entire company—ultimately, freedom from fear, replaced by a perpetual state of confidence and flow. I mean, tools, testing, and TAP don’t make writing functional code and quality products easy, but they removed so many of the time-sucking obstacles and distractions that bedeviled Google development for years, because they finally gave Google development as a whole the power to do the right thing, and we followed through with providing the knowledge required to do the right thing well—and they did it, and still do it to this day.³

Despite the fuzzy goal, this Fixit was a bit more ambitious than the last; there was a smörgåsbord of potential tasks folks could choose from to make for “more and better developer testing”. Fixing tests, check. Writing new tests, check. But there was also: Work through a testing Codelab (wiki-based internal training document). Attend one of the testing Tech Talks being presented around the company that day. Adopt tools and practices from Test Certified Level One, Two, or Three. The Test Certified-related tasks would’ve made for a wonderful, specific goal, but this was before our “Man on the Moon” mission had snapped into focus. Consequently, they were just scattered throughout the menu alongside the rest.

I believe Antoine Picard wrote this one. Notice that, at this point, Test Certified wasn’t even mentioned, despite the presence of coverage and small/medium/large labeling tasks. Unfortunately, T-shirts were mentioned, multiple times.

Also, in the interest of engaging as many developers as possible, I took suggestions from senior developers for tasks important to them that could be folded into the idea of fixing tests in general. As suggested by Ian Lance Taylor, we had a task to make your C++ code and tests 64-bit clean, since the fleet-wide 64-bit conversion was just getting underway. As suggested by Mike Burrows, we had a task to make sure the functions comprising your C++ code and tests do not exceed a particular stack size, as the 64-bit fleet conversion also meant more multicore processors and multithreaded application and library architectures, and functions with large stacks limit the viability of this model. Though we didn’t actually get a lot of participants tackling these tasks during this Fixit, and thus fell short of the dream of engaging more participants by having something-for-everyone, both of these efforts eventually took off and their missions were eventually fulfilled—which effectively benefited the practice of automated developer testing, no doubt, as C++ test binaries would eventually be executed in great numbers across production datacenters by Forge. We did include Ian’s “64-bit clean” task again in the Revolution Fixit, for reasons which will become obvious if they’re not already.

Future Fixits would continue to employ task menus, though as mentioned above, rather than being an odd scattering of discrete items in a misguided attempt to broaden participation, they were clearly concrete steps towards accomplishing a larger goal while providing immediate, tangible benefits in and of themselves. As a consequence of such context, they also generally had the nice property of being things that people would want to do either in advance of a Fixit or long afterwards, whether they received “credit” for it or not, because they could perceive the benefit of such tasks regardless.

Scoring

Along with the task menu came the concept of Fixit scoring, by giving different point values to each task and awarding prizes based on the highest overall score, as opposed to necessarily the most tasks accomplished. Scoring would go on to become almost as big a thorn in my side as T-shirts, and help prove to myself that I’m not that much of a dyed-in-the-wool geek after all. Why? Because I just don’t care about competition and high scores.

Scoring was a way to try to make things a little more fun and interesting, and to give weight to more difficult, more important tasks, while encouraging others to take care of lots of smaller tasks to compensate. But some people…man. They take this stuff seriously. Seriously enough to get into serious arguments over it, or to, well—let’s put this diplomatically: To expend effort in creative ways to maximize reward relative to actual time and energy spent. We’re talking about a minority to be sure, but still…establish a system, and in a sufficiently large population, someone will get upset over it, and/or someone will try to game it.
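To make the mechanics concrete, here is a minimal sketch of weighted scoring in Python. The task names and point values are hypothetical, since the actual menu and weights aren't reproduced in this post; the point is only that ranking goes by total points rather than by the number of tasks completed.

```python
from collections import defaultdict

# Hypothetical task names and point values; the real menu and weights
# are not listed in this post.
TASK_POINTS = {
    "fix_test": 1,
    "write_new_test": 2,
    "complete_codelab": 3,
    "make_64_bit_clean": 5,
}


def score_participants(reports):
    """Sum point values per participant from (participant, task) pairs."""
    totals = defaultdict(int)
    for participant, task in reports:
        totals[participant] += TASK_POINTS[task]
    return dict(totals)


def leaderboard(totals):
    """Rank by total score, highest first; ties are broken alphabetically."""
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


reports = [
    ("alice", "fix_test"),
    ("alice", "fix_test"),
    ("alice", "fix_test"),
    ("bob", "make_64_bit_clean"),
]
print(leaderboard(score_participants(reports)))
```

In this made-up data, one heavily weighted task outranks three small ones, which is exactly the kind of trade-off that invites both arguments and gaming.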

That’s not to say that there’s any good excuse for not doing a good job at setting up and maintaining such a scoring system. What I didn’t see coming, like with T-shirts, was all the frustration and pain that would come along with creating a scoring system as a consequence of the fact that some folks get really invested in such things. That was one of my failures as a Fixit organizer; I should’ve handed the task over to someone who enjoyed scoring systems more than I did, just as I did with so many other responsibilities that were not my forte. In the end, no one really complained about the end result, but what felt like extra pressure to me might’ve felt like a worthy challenge to someone else.

Fortunately for this Fixit, the scoring concept didn’t produce many issues. Fortunately for Google, ultimately, all of the scoring squabbles over time proved a mere surface distraction: We got the real job of changing Google development culture done in spite of such petty shortcomings.

Gladwell Mach I: Wire and Duct Tape

I have a dirty little secret: I’m not a web programmer. Most of what I know about HTML I’ve only learned since starting this blog, I’ve never touched PHP and only dabbled with JavaScript and Ruby on Rails starter projects, and my relational database and SQL experience is very limited. Still, I had the idea at the time that it would be nice to have a semi-dynamic web page for presenting the goals, organization, activity, and impact of this Testing Fixit, both to make it easier to demonstrate current progress and immediate impact, but also to build a sense of excitement around the Fixit. I say “semi-dynamic” because my eventual solution relied on me periodically executing a Python script to regenerate static HTML pages, served from my workstation.⁴ When Fixit Day came, I scheduled this script to run automatically once a minute using cron, and ta-da! Near-real-time Testing Fixit activity data on display.
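For readers curious what “semi-dynamic” amounts to, the following is a rough sketch of the general pattern, not the original script. The function names, page layout, and file paths are all assumptions made for illustration. A script rebuilds a complete static page from the current data and swaps it into place, and cron reruns the script every minute.

```python
import html
import os
import tempfile
from pathlib import Path


def render_page(participants):
    """Build a complete static HTML page from (name, office) pairs."""
    rows = "\n".join(
        f"<li>{html.escape(name)} ({html.escape(office)})</li>"
        for name, office in participants
    )
    return (
        "<html><body>\n"
        f"<h1>Testing Fixit: {len(participants)} participant(s)</h1>\n"
        f"<ul>\n{rows}\n</ul>\n"
        "</body></html>\n"
    )


def write_atomically(path, content):
    """Write to a temp file, then rename it over the target, so the web
    server never serves a half-written page."""
    path = Path(path)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as tmp:
        tmp.write(content)
    os.replace(tmp_name, path)


if __name__ == "__main__":
    # Hypothetical data; the real script pulled it from participant reports.
    write_atomically("index.html", render_page([("Ana", "Zurich")]))

# A crontab entry along these lines (paths are hypothetical) reruns the
# script once a minute:
#   * * * * * /usr/bin/python3 /path/to/regenerate.py
```

The rename step is a design choice rather than something the post describes: it keeps anyone loading the page mid-regeneration from seeing a truncated file.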

One of the core components of this application was plotting volunteer and participant names on an embedded Google Map, using code I stole from Alan Donovan, one of the Build Tools tech leads, after he circulated something similar for the Build Tools team internal homepage. I built such a view to plot all of the Fixit volunteers in various Google offices worldwide, updating it as new volunteers agreed to join in. I was fascinated to watch the world growing increasingly-populated with Fixit volunteers, and couldn’t wait to see actual participants popping up on a similar map on Fixit day. The network effect was tangible: The excitement of seeing new people come online kind of heightened my own sense of energy and interest, and I felt fairly convinced that it would have the same effect on others.

Given the nature of this new Testing Fixit, this time we relied on participants reporting their own tasks as opposed to the automatic detection of the previous Testing Fixit. Of course, my webapp-fu was insufficient for me to code up a decent task-input interface (and still is). At the time, there was an internally-popular system for collecting structured data via a web form system called Sparrow. So in the interest of time, given that I was the only developer on this app, I pointed participants at a Sparrow page where they were to record their Fixit participation, and wrote a Python module to parse out this data, which was then used to generate the static pages required to report on overall Fixit activity and to plot participants on the Google Map widget.
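The parsing step might look something like the sketch below. Sparrow’s actual data format isn’t described here, so this assumes, purely for illustration, that the self-reported records were available as CSV text with hypothetical `name`, `office`, and `task` columns and well-formed rows. It counts distinct participants per office, which is the figure a map view would need.

```python
import csv
import io
from collections import Counter


def parse_reports(csv_text):
    """Parse self-reported task records from CSV text (assumed format)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {key: (value or "").strip() for key, value in row.items()}
        for row in reader
    ]


def participants_by_office(reports):
    """Count distinct participants per office, however many tasks each logged."""
    seen = {(r["name"], r["office"]) for r in reports}
    return Counter(office for _, office in seen)


# Hypothetical sample data.
sample = """name,office,task
alice,Mountain View,fix_test
alice,Mountain View,write_new_test
bob,Zurich,fix_test
"""
print(participants_by_office(parse_reports(sample)))
```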

The image at the top of this post contains a screenshot from this webapp. I would rewrite this app from scratch in C++ for the Revolution Fixit, with added features; that version was repurposed by Matt Vail and Tayeb Karim for the Test Certified Challenge. Eventually David Plass would take this idea and implement it as a proper Google App Engine app in Python, keeping the name I chose for it when I’d originally toyed with the idea of making a proper project out of it: Gladwell.

I’d chosen the name in honor of Malcolm Gladwell’s book The Tipping Point, which explains how “epidemics” happen, be they of disease, fashion, or ideas. The intent with the webapp was to visualize Fixit activity in such a way as to encourage further participation, creating a “tipping point” whereby an initial nucleus of activity spills over across all of Google development thanks to the network effect of broad observation of visualized participation data. I’ll be discussing Gladwell a little more in future posts, to the extent I’m able.⁵

T-shirts

This Testing Fixit would provide the profoundly bitter experience that would sour me on T-shirts as schwag forever. Mamie did a wonderful job designing and procuring them—I mean, they’re beautiful, elegantly minimal, with a geek-friendly green-on-black palette, hinting at a rock-concert theme—but they ultimately brought more pain and suffering than they were worth. All—every last one—of the principles outlined in my earlier post explaining why T-shirts are the Fru-its of the Deveel were based directly on aspects of the March 2007 Testing Fixit T-shirt experience.

The Testing Fixit 2007 T-shirt designed by Mamie Rheingold.
Those of us helping organize the Fixit had “STAFF” on the back of our shirts, keeping with the rock concert theme hinted at by “testing 1, 2, 3”, the microphone, and, of course, “testing rocks.”

Scheduling and Fixit Fatigue

International Women’s Day is March 8. I never knew that before this Testing Fixit, but would never forget it since. I was alerted to its existence when volunteers from Moscow asked if they could run the Fixit a day early—or late, I can’t remember exactly—because IWD fell on Fixit Day and was a Russian national holiday. As an American, I sure didn’t see that one coming. I felt kinda dumb for not taking into account national holidays for each of our international development offices when picking the Fixit Day.

Also, the concept of “Fixit fatigue” began to emerge around this time. If there are too many Fixits happening, such that developers are being frequently lobbied by relative strangers to stop with their normal work and do something else for the sake of said strangers’ cause, the reserve of goodwill dries up pretty quickly. Maybe you haven’t run your brand of Fixit in a year, but if an unrelated group also runs a big Fixit just a month before yours, don’t expect a great deal of enthusiasm and participation for your event. It’s hard to say for sure whether this Testing Fixit suffered from it, but given only seven months since the previous Testing Fixit combined with an unsatisfyingly vague objective, enthusiasm and participation did not appear to be what they were for the earlier Testing Fixit.

Consequently, one of the primary functions I later helped establish as the Fixit Grouplet’s responsibility was that of scheduling large-scale Fixits across the entire company. The Grouplet members all took a set of international holiday calendars, available internally, and created a composite calendar marked with all of the national and international holidays for every office. We also took note of the beginning of each quarter, the end of each quarter, performance review (aka “perf”) seasons, large-scale quarterly Ads Fixits, regional ski trips, and Burning Man (since a good chunk of the Mountain View office would be unavailable that week). Basically, there would be a window of three or four possible dates in the middle of each quarter that would make for good Fixit days, and we’d work with groups to ensure there was no more than one large-scale, company-wide Fixit per quarter. Thanks to a well-placed Testing on the Toilet episode, the company quickly learned of the Fixit Grouplet’s existence and resources, and we were able to work with different groups looking to schedule large-scale Fixits without unforeseen collisions.
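The composite-calendar idea boils down to filtering a quarter’s dates against every known conflict. Here is a minimal sketch of that filter, assuming a hypothetical two-week buffer at each end of the quarter and made-up blackout ranges. Only the March 8 holiday comes from this post, and the Grouplet did this by hand with calendars, not code.

```python
from datetime import date, timedelta


def candidate_fixit_days(start, end, holidays, blackouts, buffer_days=14):
    """Return weekdays in [start, end] that stay clear of the quarter's
    edges, any office holiday, and any blackout range (inclusive)."""
    first = start + timedelta(days=buffer_days)
    last = end - timedelta(days=buffer_days)
    candidates = []
    day = first
    while day <= last:
        is_weekday = day.weekday() < 5  # Monday is 0, Friday is 4
        in_blackout = any(lo <= day <= hi for lo, hi in blackouts)
        if is_weekday and day not in holidays and not in_blackout:
            candidates.append(day)
        day += timedelta(days=1)
    return candidates


# Q1 2007, with International Women's Day as a holiday in at least one
# office. The blackout range is hypothetical, standing in for things like
# perf season or another group's Fixit.
holidays = {date(2007, 3, 8)}
blackouts = [(date(2007, 1, 15), date(2007, 2, 28))]
days = candidate_fixit_days(date(2007, 1, 1), date(2007, 3, 31),
                            holidays, blackouts)
print([d.isoformat() for d in days])
```

With enough holidays and blackout ranges loaded in, the surviving list shrinks quickly toward the handful of mid-quarter dates described above.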

Well, once there was one group that started promoting its own large-scale Fixit without consulting with the Fixit Grouplet, and I firmly encouraged its organizer to please back off and run it at a later date due to a conflict with another Fixit. The organizer did agree, even though I lacked any authority whatsoever, even as the Fixit Grouplet leader, to compel such action. At the end of the day, I hope it was clear that it wasn’t a matter of office politics aimed at building petty empires, but a matter of everyone choosing to work with rather than against one another, to ensure the best outcome for all. When you can pull off that argument, solely through persuasion rather than authority, you’ve got a healthy culture.

Fixit Day

I was already exhausted by the time March 8, 2007 rolled around. Mamie and I had done the lion’s share of the organization, coordination, and logistics, plus I had spent most of my time outside of the office neglecting my then-girlfriend in favor of working on the Fixit webapp.⁶ This time the war room was Tech Talk 42, the large(ish) open space in Building 42 where most of the big tech talks happened back then. It had two things going for it: a large projection screen, upon which the Fixit webapp projected the up-to-the-minute participation stats; and lots of foot traffic. Mark Striebeck loved this setup, and almost exactly three years later we did the same thing for the TAP Fixit.

We set up shop there first thing in the morning, and the T-shirts were delivered. The T-shirt situation was a disaster. My hapless volunteers from other buildings across the Googleplex, which by now included buildings across Charleston Road and further down the street a ways, started arriving to rummage through the boxes and carry what they could back to their buildings. They had to guess as to the distribution of sizes they’d need, and to make matters worse, some of the STAFF shirts that I’d promised to these volunteers were delivered to parts unknown, so some had to make do with a “normal” Fixit shirt. Thankfully, they were all good sports about it, but I felt positively awful.

There was one early bright spot the T-shirts brought to my day, however, when I accidentally overheard Michelle Levesque commenting to Mamie Rheingold: “Did you notice where the microphone falls?” They noticed me overhearing them, and I tried to carry on with my best Keeping-It-Legal game face on. But still, I never forgot.

I had great company in the war room for most of that day. Ana Ulin and Henner Zeller were there from the Zürich office, and I recall Antoine Picard spending a good deal of time there. Michelle Levesque was dispatching the Noogler Army to deliver T-shirts to participants, as during the previous Testing Fixit, according to the Noogler Dispatch Protocol. Still, I felt stressed by the image of the web app up on the big screen for everyone walking by to see, with participation in the low hundreds amongst a development population that was now in the thousands. That, and I had to watch the T-shirt piles like a hawk, given that Googlers, culturally-conditioned to expect free T-shirts as they are free meals, ski trips, and espresso machines, would occasionally hover about, ready to make off with handfuls of T-shirts without compunction.

By the late afternoon, the participation numbers picked up a bit, but other folks in the war room began to trickle out, while I stayed behind to make sure I was around if people had questions or needed to collect a (by this time, unlikely to have in the correct size) T-shirt. There were precious few such instances, but I was there ’til about 10pm, if I recall correctly, just to make sure. For a long time, I sat alone in Tech Talk 42, a big projection on the screen, with hardly anyone else walking by. When I was reasonably confident there’d be no more participants, I closed up shop, carried the remaining boxes of T-shirts back to my desk, and went home.

Aftershock

By the end of the actual Fixit Day, there were only about 320 recorded participants. I felt pretty let down by this, but a strange thing happened: Over the course of the next couple of weeks, people kept participating, kept submitting tasks, and asking for T-shirts. I let the registration system stay open for as long as it seemed people were still working on Fixit tasks, and it eventually occurred to me that perhaps another reason for the seemingly-low participation numbers was that most folks didn’t bother registering their tasks. During the previous Testing Fixit, we rewarded some folks who weren’t even totally aware that they were participating in a Testing Fixit; though I never held a survey to find out, or heard any complaints, it’s likely that some people might’ve expected we’d’ve automatically detected their work and rewarded them. And if they weren’t motivated to register their tasks in the first place, they probably weren’t motivated to follow up and register later, or to ask for a reward. If this theory holds, such folks were lurkers, essentially, deriving motivation from the Fixit activity without feeling the need to seek a reward beyond performing tasks for their own benefit. It is just a theory—a hypothesis, really—but I do argue that many of the folks who registered their tasks in the weeks after the Fixit Day were, essentially, lurkers who decided to stop lurking and get “official credit” for their tasks anyway.

By the time the registration was finally closed, we’d recorded 535 participants across twenty-nine Google development offices, as you can see in the image at the top of this post. Still not a huge fraction of the total Google development population, but significantly more than the 320 registered during the Fixit Day. This display of motivation to continue working on Testing Fixit tasks was encouraging.

Also during these weeks following the Fixit, I had a pretty regular stream of participants coming by my desk to claim T-shirts; if not for themselves, because we ran out of the “good” sizes, possibly for a friend or family member. One of my officemates at the time, Guido van Rossum, started getting slightly peeved at all the T-shirt traffic, as it was mildly disruptive. But despite the disruption, and the hair-pulling hassle of dealing with T-shirts, it was rewarding to have so many developers I’d never met come by to say hello, giving me the chance to personally thank them for participating in the Testing Fixit and helping spread the practice of automated developer testing at Google.

Roles

I decided to write a retrospective document to take stock of everything that happened. In addition to fulfilling a sense of responsibility to demonstrate that all the time and resources spent on this Fixit were useful, I wanted to answer a question: Why was this Testing Fixit so much more difficult than the first one?

As it would turn out, for most people, this Testing Fixit seemed easier than the first one. Mark Striebeck would comment that he thought this Testing Fixit was remarkably well-run, which mystified me. Why did others perceive it so? Because, as noted earlier, I’d taken too much responsibility for too much of the Fixit. Well, me and Mamie both. So the question became: What exactly would I say it was that we did here?⁷

Taking stock, I came up with a list of responsibilities that Mamie and I largely fulfilled ourselves. This list was later expanded and refined by the Fixit Grouplet into a set of canonical “Fixit Roles”, enumerated here and labeled with their official names and technical names in parentheses:

The Walrus (Organizer): Ultimately responsible for the organization and success of the entire effort; sets the vision and goals, and assists the rest of the core Fixit team in fulfilling their roles, removing obstacles and making decisions when necessary
Prime Mover (Manager): Ensures tasks are assigned and executed, and that the logistics of the entire Fixit operation are on track
Contact Czar/Czarina (Contact/Volunteer Coordinator): Recruits volunteers and keeps them engaged, making sure they’re equipped with flyers and other propaganda, and responding to their requests and concerns
Schwagmeister (Prize Coordinator): Designs, procures, and distributes T-shirts, or schwag generally speaking
Minister of Propaganda (Publicist): Designs and distributes flyers and other promotional media
Minister of Information (Tech Writer): Maintains the documentation for both volunteers and participants, both operational and technical
Scheduler (Schedule Coordinator): Reserves conference rooms, tech talk rooms, and war rooms across offices participating in the Fixit
Hacker (Tools Developer): Develops and administers the webapp that tracks and encourages participation, and/or other relevant tools
Minister of Communication (Communication Coordinator): Keeps communication channels open and flowing between all volunteers and participants
Field Commander (War Room Coordinator): Recruits volunteers for and manages a local war room, where volunteers within an office congregate to collaborate on technical issues, and where participants can walk in to seek assistance
Minister of Phynance (Accountant): Manages the budget for schwag and prizes
Cat Herder (Tech Talk Coordinator): Commissions and schedules tech talks throughout the various offices
Mayor of Jonestown (Noogler Coordinator): Focuses on recruiting Nooglers (new Googlers) to aid with publicity and schwag delivery, as part of their introduction to Google development culture⁸
Heart and Soul (Recognition Coordinator): Ensures Fixit volunteers are thanked, recognized, and rewarded⁹
Damned Liar (Statistician): Collects statistics on participation and impact, and reports thereon
Historian (Historian): Documents the history and lessons of the Fixit for the benefit of future Fixit organizers

As David Plass eventually observed, dividing Fixit responsibilities up into roles is just like running a business. The thing with Fixit roles, however, is that this is serious business that is essentially 20% volunteer-driven. Tokens of appreciation and peer bonuses aside, nobody’s really getting paid extra or gaining broad official recognition for doing this kind of work. There’s also zero authority available with which to compel others to participate. Given Google’s ever-expanding development organization, this Testing Fixit impressed upon me that larger-scale Fixits were threatened with extinction absent a clear division of responsibilities, yet making things too formal would suck the fun and excitement out of the process to the point where no one would want to participate.

Consequently, these Fixit roles were so named not just to assign very clear responsibilities to individuals, but to make sure the only people who signed up for duty “got it”. How can you power trip with a role like Minister of Propaganda? Damned Liar? The Walrus? How could you not smile with both pride and amusement at being a Schwagmeister, a Field Commander, a Cat Herder? At the same time, having accepted such a role, your place in the Fixit universe was clear, as was your relationship with other volunteers. Sure, there’s a good amount of overlap between roles, and everyone’s empowered to take care of issues outside the rubric of his/her role if so inclined and able, but when an issue arises that you can’t immediately handle, you know exactly who to hand it off to, and remain reasonably secure that it will get handled. And you knew what other people were counting on you to bring to the table.

The purpose of these roles wasn’t to use them to control people; quite the contrary, once a role was accepted, it was largely up to the individual to bring his/her own energy, motivation, and creativity to the process of fulfilling the role. The boundaries of a role would set one free from the ambiguity of not knowing what to do or how to interact with other volunteers. I saw my role as The Walrus to facilitate everyone else’s activity, to make decisions and solve problems and make course corrections only when necessary, to maintain fingerspitzengefühl while everyone operated according to commander’s intent. I always accepted ultimate responsibility for any issues arising from a Fixit, but the credit for its success went to everybody else, from whose passion and ingenuity emerged outcomes far beyond what I could’ve imagined.

After consciously applying Fixit Roles successfully to the Revolution Fixit in January 2008, I would open the initial August 2008 NYC Testing Summit and March 2010 TAP Fixit planning meetings by having a list of these roles with my name next to each in bold, red type—except for The Walrus, which was in normal, green type—and announce to everyone that the ultimate success of the venture would be inversely proportional to the number of these roles with my name remaining next to them in red. By the end of each of those initial meetings, nearly all of the roles would have someone else’s name next to them in green. From there, people would immediately start to kick ass and take names, and my job would be mostly to just provide whatever assistance they needed, and to watch in amazement as everyone brought their A-game and made magic together.

Epilogue

In the end, this Testing Fixit was a success to some degree, as it did get several hundred folks to participate—and potentially hundreds more who completed tasks they didn’t bother to register. However, the more important impact the Fixit had was not immediately apparent. The critical role this particular Testing Fixit played in the Testing/Tools Fixit Arc became clear only in hindsight.

Soon after this Testing Fixit, I finished off my minor backlog of work on the SrcFS backend and joined the Test Mercenaries. Also, I thought I was done running Fixits—I was sure I had no more in me after the pain and exhaustion of this most recent one. But given my experience running two large-scale Fixits, I had set myself up to succeed David Kramer as Fixit Grouplet chair, which I eventually did the following September. Plus, about a month and a half before assuming that role, I was bitten by the idea leading to the Revolution Fixit. Though this Testing Fixit was not important in terms of taking the Testing Grouplet to the next level of actual culture change, it produced critical insights into the process of running a large, successful Fixit that served the Revolution Fixit very well and gave the Fixit Grouplet lots of substance to work with.

“More and better developer testing” wasn’t a concrete enough goal. Though we got some attention, and folks did participate and get T-shirts and whatnot, it didn’t feel like the Testing Grouplet was as hungry and fired up as before, like it had that “we’re going to prove ourselves to the world” kind of energy. The concrete “Man on the Moon” mission of getting all Google development teams to Test Certified Level Three by the end of 2009, formulated the summer after this Testing Fixit, meant no more wishy-washy “more and better” testing goals. Never again would we run a Fixit without a concrete goal that got us towards the more strategic “Man on the Moon” goal in a tangible, measurable fashion that fired up both the Testing Grouplet and Google development at large. We could never measure the impact our efforts had on the bottom line of company profits, but we could see how many developers adopted improved testing policies and practices, as well as more powerful tools, which everyone intrinsically valued despite the fact that their impact on the bottom line isn’t any more direct.

And as a final legacy, from this Testing Fixit forward, my official Fixit policy became: No…more…T-shirts…EVER!!!

Footnotes

1. It’s also interesting to note that the first Testing Fixit took place on 8/3, and the second on 3/8 (or the other way ’round for y’all non-USians).
2. Just as he did about the drinking game I played during a Testing Grouplet meeting where I videoconferenced into Mountain View from Zürich in the middle of my Tools and Testing World Tour, knocking back a constant stream of Jäger shots while a beloved but notoriously loquacious fellow member got rolling in the last fifteen minutes of the meeting, while Michelle Levesque turned helplessly between said member, me going at it on the video screen, and a giggling Neal Norwitz across the table from her, with her instant-messaging me desperately to please stop, because she had nowhere to turn and was about to explode. Yeah, as per Nick’s advice following the meeting, I didn’t do that or anything like it ever again (though, in my defense, it was Ana Ulin and Henner Zeller’s idea to take me shopping for Jäger before the meeting). I also promptly removed the Jäger from the fridge in the first-floor microkitchen of the Zürich office when a gentle reminder from the local HR director circulated amongst the Zooglers over email during my stay there, which I never received directly but had forwarded to me from several others, that bringing such substances on company property was technically verboten.
3. Granted, I often hear old colleagues complain that people aren’t writing great tests all the time, but that’s a different problem, a nicer one to have than the whole “I don’t have time to test (in the first place)” problem. I rejoice that they’re writing tests, without argument, making sure the code isn’t broken for everyone along the way, and getting their hands dirty and contributing to the experience pool. Sure, there’s the risk that folks are making mistakes leading to less-than-ideal outcomes in the short term, but there’s no perfection without practice.
4. As noted in the footer of every page on this site, I generate static HTML for this blog using Jekyll. It contains no JavaScript. It is not, however, being served from my personal workstation, though given the effective queries-per-second and corresponding server load, it almost certainly could be.
5. I have to admit, I felt a little creepy after I moved to New York and started seeing Mr. Gladwell around my neighborhood, all the time, even walking up behind him in line at my favorite coffee shop. Eventually, I even watched him walk past me down the sidewalk and into his home. It feels wrong knowing where he lives. That didn’t stop me from eventually pointing out his residence to David Plass, however.
6. Even in retrospect, I don’t regret this at all. It wasn’t going to work out anyway, my workaholism aside.
7. Man, what is up with IMDB’s paltry collection of Office Space quotes? Then again, guess they can’t put the entire script up. The final line of the scene I want to reference is there, at least.
8. As noted earlier, this was Michelle Levesque’s role during this Testing Fixit.
9. At Google, of course, recognition is largely its own reward.