This third post in the Making Software Quality Visible series describes the next steps of my personal journey with software quality and automated testing.
I’ll update the full Making Software Quality Visible presentation as this series progresses. Feel free to send me feedback, thoughts, or questions via email or by posting them on the LinkedIn announcement corresponding to this post.
Side note: Image format differences between the blog and email
By the way, if you’re reading this on my website, the illustrations below will look really nice, since they’re inline SVG elements. Their colors adapt to the light and dark modes of the blog.
However, if you’re subscribed to my email list, you should know that many HTML features do not work in many email clients. Sadly, few email clients support SVG, inline or otherwise. Dark mode, or more precisely @media queries, is also poorly supported, so offering different image versions to match both light and dark modes isn’t trivial. (There appear to be hacks out there, but they’re beyond me at this point.)
So as a compromise, I’ve created PNG versions of the inline SVG images using a light gray background specifically for HTML emails. It’s not as great an experience as I’d prefer to offer, but hopefully they’re sufficiently legible in both light and dark mode emails. If they’re not, click the post title in the email to read the web version with the proper SVG images.
(Naturally, I may blog about this one day, too. I still have my dark mode and other CSS hacks to cover eventually as well…yay for git history!)
If you’re reading this directly from my web feed, well, I’m curious to see how the SVGs render there for the first time. If they don’t, or render poorly, I’ll see what I can do to apply the same PNG images in the feed.
Now back to our regular program—picking up right after the previous post, Formative Experiences at Northrop Grumman Mission Systems…
Google: Testing Grouplet
At Google I joined the Testing Grouplet, a team of volunteers dedicated to driving automated testing adoption. My talk “The Rainbow of Death” tells the five-year story of the Grouplet. I’ll give you the fast-forwarded version of that talk right meow.1
Rapid growth, hiring the “best of the best,” build/test tools not scaling
When I joined in 2005, the company was growing fast,2 and we knew we were “the best of the best.” However, our build and testing tools and infrastructure weren’t keeping up.
Lack of widespread, effective automated testing and continuous integration; frequent broken builds and “emergency pushes” (deployments)
Developers weren’t writing nearly enough automated tests, the ones they wrote weren’t that good, and few projects used continuous integration. As a result, code often failed to compile, and errors that reached production frequently led to “emergency pushes,” or deployments.
Resistance: “I don’t have time to test,” “My code is too hard to test.”
We kept hearing that people didn’t have time to test, or that their code was too hard to test.
(Mostly) smart people who hadn’t seen a different way
Basically, these were (mostly) smart people who didn’t know what they didn’t know, and couldn’t afford to stop and learn everything at once.
We had to identify how to get the right knowledge and practices to spread over time.
Geoffrey A. Moore, Crossing the Chasm, 3rd Edition
Different people embrace change at different times
The “Crossing the Chasm” model from Geoffrey Moore’s book of the same name helps to make sense of our dilemma.3 At a high level, it illustrates how different segments of a population respond to a particular innovation.
Innovators and Early Adopters are like-minded seekers, enthusiasts and visionaries who together bring an innovation to the market and lead people to adopt it. I like to lump them together and call them Instigators.
The Early Majority are pragmatists who are open to the innovation, but require that it be accessible and ready to use before adopting it.
The Late Majority are followers waiting to see whether or not the innovation works for the Early Majority before adopting it.
Laggards are the resisters who feel threatened by the innovation in some way and complain about it the most. They may raise valid concerns, but often they only bluster to rationalize sticking with the status quo.4
The Instigators face the challenge of bringing an innovation across The Chasm separating them from the Early Majority, developing what Moore calls The Total Product. Developing the Total Product requires that the Instigators identify and fulfill several needs the Early Majority has in order to facilitate adoption.
As Instigators, the Testing Grouplet focused its energy on connecting with other Instigators and the early Early Majority to deliver the Total Product. We largely ignored the highly vocal Laggards.5
The Rainbow of Death
This connection across the chasm isn’t part of the original Chasm model, but one I borrowed from a friend6 and called “The Rainbow of Death.” It helps illustrate those Early Majority needs the Instigators must satisfy. Doing so transforms the Early Majority from being dependent on the Instigators’ expertise into independent experts themselves.
Five years of chaos…
…and one Rainbow to rule them all
I’ll now use the Rainbow of Death to show how the Testing Grouplet eventually brought that Total Product across the Chasm.
[Note that the following steps are animated in the actual presentation, filling in the Rainbow of Death graphic one block at a time.]
Intervene + Empower: Of course there were already teams working to empower developers by improving development tools and infrastructure. But as you can see, there’s still a large gap between delivering tools and helping people use them well.
Mentor: That’s where the Testing Grouplet stepped in.
Inform: We started by training new hires, writing “Codelab” online training modules, hosting Tech Talks, and giving out tons of free books. But we noticed people weren’t necessarily reading those books, or otherwise applying the knowledge we were sharing. So we transformed from a “book club” to an “activist group.”7
Inspire: We shared the Google Web Server success story…
Validate: …then distilled that experience into the Test Certified roadmap program. This program comprised three levels containing several tasks each. It removed friction and pressure for teams by providing a starting point and path for focusing on one improvement at a time.
Mentor: We also offered volunteer “Mentors” to guide teams through the process and celebrate success…
Inspire: …and physical, glowing “build orbs” to monitor their build status.
Intervene: We eventually built the Test Mercenaries internal consulting team to work with more challenging projects on climbing the “TC Ladder.”
Inform: And our biggest hit was our Testing on the Toilet newsletter, appearing weekly in every company bathroom.
Inspire: We eventually focused on getting every team to operate at Test Certified Level Three, whether they were officially enrolled or not.
Inspire: All of this was punctuated by four “Fixits,” companywide events to address “important but not urgent” issues. Our Fixits inspired people to write and fix tests…
Empower: …to adopt new build tools,8 and finally…
Empower: …to adopt the Test Automation Platform continuous integration system.
These efforts made quality work and its impact more visible than it had been. This helped people write better tests, adopt better testing practices and strategies, drastically improve build and test times, reduce bugs, and increase productivity. But perhaps the most visible result was scalability of the organization.
Google: Testing Grouplet results
2015, R. Potvin, Why Google Stores Bills. of LoC in a Single Repo
Rachel Potvin shared the following results in her @Scale 2015 presentation, “Why Google Stores Billions of Lines of Code in a Single Repository.” They may seem quaint to Googlers today, but they speak to the Testing Grouplet’s enduring impact five years after the TAP Fixit.
- 15 million LoC in 250K files changed by humans per week
- 15K commits by humans, 30K commits by automated systems per day
- 800K/second peak file requests
Of course, the Testing Grouplet isn’t responsible for all of this; Rachel’s talk describes an entire ecosystem of tools and practices. Even so, she states very clearly that:
- “TAP is our automated test infrastructure, without which this model would completely fall apart.” (13m:36s)
Also, it may amuse you to know that Testing on the Toilet, started in 2006, continues to this day!9
Coming up next
I’ve built up a few posts that I should be able to release somewhat quickly, though I’ll try to pace myself.
The next two posts will expand on the footnotes regarding working successfully with a Laggard and how that inspired the Revolution Fixit.
Then, as a break, I’ll explain my method for converting Keynote images to SVGs that I use here in the blog. It’s a little involved, not terribly automated, and requires a little time and care, but it’s relatively easy once you get the hang of it. (Or once you write down the instructions so you don’t have to figure it out every time, that is. This time, I created an Obsidian note with all the details that I’ll use to write that post.)
The post after that will get us back into the series. I’ll summarize my mistakes and lessons learned while trying to recreate the Testing Grouplet experience—and how I put those lessons to good use at Apple.
The Crossing the Chasm model can be traced back to Everett Rogers’s Diffusion of Innovations model from 1962. That model differentiated the five populations, but lacked a “chasm.” The chasm was added by Lee James and Warren Schirtzinger of Regis McKenna Inc., where Moore also worked.
Articles that dig into the Chasm’s history include:
The first article above presents a number of criticisms of the Crossing the Chasm model. Like criticisms of the Test Pyramid model, I think they split hairs and miss the point. Not because their points aren’t valid, but because they’re better presented as further refinements for consideration after grasping the concept, not criticisms of the model.
No model is perfect, but a good one is at least effective at bringing new people into the conversation. Once they’re in, and comfortable with the concepts and the language, we can point out nuances not captured by the model. But without the model, people may not gain access to the conversation to begin with. ↩
I’ve had people suggest that Laggards are actually the dominant population, comprising the actual majority. I remind them that it only seems that way—they’re the most vocal because they feel they have something to lose. Once both Majorities adopt an innovation, their voices lose power. ↩
I do have one story about engaging a Laggard that had a genuinely happy, “everybody wins” ending. I’ll cover it in my next blog post. (If you can’t wait ’til then, and want to read about it now, find the footnote corresponding to this one in Making Software Quality Visible.) ↩
Albert Wong, former Googler and member of the U.S. Digital Service. I saw his original model in his presentation on his early work as a member of the USDS, working with Citizenship and Immigration Services. In my mind, I instantly saw it snapping into the Chasm—and helping me make sense of the Google Testing Grouplet’s story.
I asked Albert if I could borrow the model, and he agreed. I also asked if he minded me giving it a funny name, and he didn’t.
The multicolored span of the model reminds me of a rainbow, and my weird sense of humor inspired me to pair it with an incongruous concept. Hence, “The Rainbow of Death.”
Two years after I started using the model, I realized how the concept of “Death” actually fits. The model helps explain how the problem you want to solve may not be the problem you have to solve first. To achieve that insight, old ideas about the problem and what’s required to solve it have to die to make room for new ideas.
For example, the Testing Grouplet wanted to improve automated testing and software quality—but we had to figure out how to sell others on it. We eventually realized we needed to do more than train new hires once, host tech talks, and give out books. We kept doing those things, but we couldn’t only continue putting information out there in the hopes that people would use it. We realized we needed to get people more directly engaged—leading to Testing on the Toilet, Test Certified, the Test Mercenaries, and a series of Fixits. Our work also influenced build and testing infrastructure development, culminating in the launch of the Test Automation Platform.
More to come in a following footnote/blog post… ↩
This was how Testing Grouplet co-founder Bharat Mediratta described the choice we had to make about how to operate. ↩
The Revolution Fixit was the third Google-wide testing Fixit I organized, helping set up the TAP (Test Automation Platform) Fixit two years later. This event introduced Google’s now-famous cloud-based build and testing infrastructure to projects across the company.
I’ll briefly cover this event and its impact at a high level in the blog post after next. (Or, as mentioned in the previous footnote, find the footnote corresponding to this one in Making Software Quality Visible to read about it now.) ↩