Revolution Fixit at Google

My Laggard encounter eventually inspired me to organize the Revolution Fixit in January 2008, launching new tools that drastically reduced suffering all across Google.

09 Aug 2023
Tags: Fixits, Google, Making Software Quality Visible, Test Certified, Test Mercenaries, Testing Grouplet, Testing on the Toilet, TotT, grouplets, programming, testing

This fifth post in the Making Software Quality Visible series very briefly summarizes the Revolution Fixit at Google that I organized in January 2008. This event launched Google’s cloud-based build and test infrastructure, plus Blaze (a.k.a. Bazel), dramatically reducing build and test times and suffering. It expands on the second of two footnotes from the third post in the series, Formative Experiences at Google. The first footnote was covered in the previous post, Rare Happy Ending with a Laggard.

I’ll update the full Making Software Quality Visible presentation as this series progresses. Feel free to send me feedback, thoughts, or questions via email or by posting them on the LinkedIn announcement corresponding to this post.

Footnote from "Five years of chaos…"

The Revolution Fixit was the third Google-wide testing Fixit I organized, and it helped set the stage for the TAP (Test Automation Platform) Fixit two years later. This event introduced Google’s now-famous cloud-based build and testing infrastructure to projects across the company. I named it after my favorite Beatles tune (tied with “I Am the Walrus”), leading to spectacularly Beatles-themed announcements, advertisements, prizes, etc.

My aforementioned experience setting up the continuous build system after engaging with a Laggard became the genesis for the Revolution. Because the standard tools at the time were so slow and painful to use, I tried using a couple of experimental tools in the build. SrcFS was an officially supported source control cache that hadn’t yet been released for general use. Forge was a distributed build and test execution system developed by someone outside of the tools team as a 20% project. Together, these tools immediately brought the full build and test cycle times from about 1h 15min to about 17min.

The only thing was, these new tools were far more strict. Each BUILD target had to fulfill two conditions in order to be able to use them:

All input artifacts had to be in source control. At the time, because the existing source control system was overloaded and slow, many projects maintained larger artifacts in NFS. While some artifacts might’ve been “version controlled” via naming conventions, there was no enforcement of this, automated or otherwise, rendering verifiably reproducible builds effectively impossible. (We also called these “non-hermetic” builds, as opposed to “hermetic” builds in which all inputs were properly version controlled.)

In order for Forge to successfully build a project, all of its inputs had to be accessible via source control only, full stop. It couldn’t access NFS even if it wanted to.

The dependency graph had to be completely specified. Many projects at the time contained two common yet trivially fixable dependency problems:
1. Undeclared data dependencies for build and test targets. This was sometimes related to not having all input artifacts in source control, but many times they already were.
  
  The fix was to add the necessary declarations to the affected build or test targets that should’ve been specified to begin with.
2. C++ headers that were not #included directly in the files that needed them. My memory has grown a little fuzzy regarding exactly how this broke Forge builds, but I’ll try to recall it as best I can.
  
  Forge would scan each source file for the headers it needed to build and ship them to the remote build machine. However, I don’t think the scan was as complete as the C preprocessor, at least not at that time. Either that, or the SrcFS-aware Forge frontend restricted the search paths more so than the previous build frontend. At any rate, Forge builds worked perfectly well if all necessary headers were #included directly where they were needed. However, if a file only indirectly #included a header containing definitions used directly within that file, the Forge scan might not find it. Then, if any of the missing headers weren’t standard system or compiler headers available on the remote system, the build would break.
  
  The fix was to add the necessary #include declarations to the source files and any necessary package dependencies to the BUILD targets for those files. Again, these declarations and dependencies should’ve been specified to begin with.

The officially supported tools at the time unintentionally allowed these dependency problems to pass through for years. Forge, however, had to be far more strict, since it had to ship everything it needed for building and execution to remote machines.
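
To make the header case concrete, here is a minimal Python sketch of the kind of audit that would flag it. This is not how Forge worked internally, which isn't described here in enough detail to reproduce. It only assumes that a remote build can see headers from explicitly declared dependencies and nothing else. The function names, the `HEADER_OWNERS` map, and every path and target label are hypothetical.

```python
import re

# Matches quoted includes such as: #include "base/logging.h"
# Angle-bracket includes (<vector>) are skipped on the assumption that
# system and compiler headers already exist on the remote machine.
INCLUDE_RE = re.compile(r'^[ \t]*#[ \t]*include[ \t]+"([^"]+)"', re.MULTILINE)


def direct_includes(source_text):
    """Return the quoted headers a source file #includes directly, in order."""
    return INCLUDE_RE.findall(source_text)


def undeclared_headers(source_text, declared_deps, header_owners):
    """Return directly included headers whose owning target is not declared.

    header_owners maps a header path to the (hypothetical) target that
    exports it. Headers with no known owner are reported as well, since
    no declared dependency would ship them to the remote machine.
    """
    missing = []
    for header in direct_includes(source_text):
        owner = header_owners.get(header)
        if owner is None or owner not in declared_deps:
            missing.append(header)
    return missing


# Hypothetical example data; none of these paths or targets are real.
HEADER_OWNERS = {
    "base/logging.h": "//base:logging",
    "net/socket.h": "//net:socket",
}

SOURCE = '''
#include <vector>
#include "base/logging.h"
#include "net/socket.h"
'''

# Only //base:logging is declared, so net/socket.h is flagged.
print(undeclared_headers(SOURCE, {"//base:logging"}, HEADER_OWNERS))
```

Under these assumptions, the fix matches the one described above: declare the missing dependency (here, the hypothetical `//net:socket`) and the list comes back empty.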

Also, in both cases, the fixes to get the code building in Forge were trivial: explicitly declare the dependencies that already existed.

Granted, there were other obstacles to using Forge for some builds and tests that took more time to resolve. However, failure to meet these basic conditions was the most common issue by far. Satisfying these conditions proved relatively quick and easy, and created more time and space to tackle the larger problems eventually.
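
The first condition, that every input live in source control rather than on NFS, also lends itself to a mechanical check. The following is a rough Python sketch under stated assumptions, not a description of any actual Google tool. It assumes the source-controlled tree is mounted under a single root directory, and the function name and all paths are made up for illustration.

```python
from pathlib import PurePosixPath


def non_hermetic_inputs(inputs, source_root):
    """Return the inputs that live outside the source-controlled tree.

    This is a purely lexical check on path prefixes. It does not resolve
    symlinks or ".." components, so treat it as an illustration only.
    """
    root = PurePosixPath(source_root)
    outside = []
    for path in inputs:
        try:
            # relative_to raises ValueError when path is not under root.
            PurePosixPath(path).relative_to(root)
        except ValueError:
            outside.append(path)
    return outside


# Hypothetical paths for illustration only.
INPUTS = [
    "/src/project/main.cc",
    "/src/project/testdata/golden.txt",
    "/nfs/shared/models/big_model_v2.bin",
]

# The NFS-hosted artifact is the only input outside the assumed root.
print(non_hermetic_inputs(INPUTS, "/src"))
```

Any path this reports would need to be checked into source control before a build restricted to source-controlled inputs could succeed.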

Since I’d already run two company-wide Testing Fixits (in 2006 and 2007), I immediately saw an opportunity to lead another new Fixit. Hence, the Revolution Fixit, which rolled out SrcFS, Forge, and also Blaze (the original name for Bazel). Between these three, we could realize the dream goal of “five minute (or less) builds” for every project. Once projects checked their build inputs into source control and properly declared their dependencies, their build cycles suddenly dropped from hours to minutes.

Also, solving these dependency problems and speeding up build and test execution times suddenly made a lot of code much easier to test.

Faster builds plus two years of publishing Testing on the Toilet and promoting Test Certified dramatically weakened the “I don’t have time to test” excuse. People had gradually learned what the “right thing” was with regard to testing by that point, and finally had the power to do it.

One of the neatest things about the Revolution was that, for weeks afterwards, I would hear people talking about “Revolutionizing” their builds. Even though not every project participated fully on the Fixit day, within a year, every project had migrated to the new infrastructure. I compare the before and after effects in Coding and Testing at Google, 2006 vs. 2011.

In the two years after the Revolution:

Other Testing Grouplet members organized a three-month, interoffice Test Certified Challenge, which greatly increased participation.
The Build Tools team organized a Forgeability Fixit to try to resolve all remaining Forge blockers.
Then I organized the Test Automation Platform Fixit, essentially concluding the Testing Grouplet’s primary mission after five years.

I hadn’t written in detail about either the Revolution or the TAP Fixit before I ran out of steam writing about Google in 2012. Time has passed, and many memories have faded, but one day I may yet try to share more of what I’m able.

Again, one can’t always expect such productive, high-impact outcomes as a result of engaging with a Laggard. It’s even possible I might’ve stumbled upon the idea for the Revolution eventually without that encounter. But the encounter did happen, and it demonstrates the virtue of finding common ground, versus bearing a grudge and working against one another out of spite.

Coming up next

The next post will take a quick break from the Making Software Quality Visible series, covering how I convert Keynote images to SVGs.

Converting Keynote graphics to SVG

Following that, I’ll continue the series by summarizing my mistakes and lessons learned while trying to recreate the Testing Grouplet experience. More specifically, I’ll describe how I put those lessons to good use at Apple.

Formative Experiences at Apple