This sixth post in the Making Software Quality Visible series describes the next steps of my personal journey with software quality and automated testing.
I’ll update the full Making Software Quality Visible presentation as this series progresses. Feel free to send me feedback, thoughts, or questions via email or by posting them on the LinkedIn announcement corresponding to this post.
Picking up after the previous three posts:
One more time…
After the Testing Grouplet succeeded, I worked on websearch indexing for a couple of years. Then I burned out, dropped out, and tried the music thing again. It obviously didn’t really work out, or I wouldn’t be speaking to you now.
Apple’s goto fail
Finding More Than One Worm in the Apple, CACM, July 2014
My descent back into the tech industry began in February 2014, thanks to Apple’s famous “goto fail” bug.
| Requirement | Apply algorithm multiple times |
| Assumption | Short algorithms safe to copy |
| Reality | Copies may not stay identical |
| Outcome | One of six copies had a bug |
| Impact | Billions of devices |
Requirement: Apple had to update part of its open source Secure Transport component which applied the same algorithm in six places.
Assumption: The developers apparently assumed that this short, ten-line algorithm was safe to copy in its entirety, instead of making it a function.
Reality: One problem with duplication is that the copies may not remain identical.
Outcome: As it happened, one of the six copies of this algorithm picked up an extra “goto” statement that short-circuited a security handshake.
Impact: Once it was discovered and patched, Apple had to push an emergency update to billions of devices. It’s unknown whether it was ever exploited.
The complexity produced by copying and pasting nearly-but-not-quite-identical code yielded poor quality that masked a horrific defect. My article “Finding More Than One Worm in the Apple” explains how this bug could’ve been caught, or prevented, by a unit test.
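As a rough illustration of that point, here is a minimal sketch in Python rather than the original C. Every name in it is hypothetical, and the "steps" merely stand in for the hashing and verification calls; it is not Apple's code. It assumes the repeated algorithm is a sequence of steps that each return an error code, and shows how one small test, run against the shared function and against a pasted copy with a stray early exit, exposes the copy that skips its final check.

```python
# Hypothetical sketch: names and steps are illustrative, not Apple's actual code.

def verify_shared(steps):
    """Run each step in order and return the first nonzero error code, else 0."""
    for step in steps:
        err = step()
        if err != 0:
            return err
    return 0


def verify_buggy_copy(steps):
    """A pasted copy with a stray early exit, loosely analogous to the extra goto."""
    for position, step in enumerate(steps):
        err = step()
        if err != 0:
            return err
        if position == 1:
            return err  # err is still 0 here, so later steps never run
    return 0


def steps_failing_at(index, count=3, code=-1):
    """Build `count` steps in which only the step at `index` reports an error."""
    return [(lambda i=i: code if i == index else 0) for i in range(count)]


def first_undetected_failure(verify, count=3):
    """Return the index of a failing step that `verify` misses, or None."""
    for index in range(count):
        if verify(steps_failing_at(index, count)) == 0:
            return index
    return None
```

Calling `first_undetected_failure(verify_shared)` finds nothing, while the same call on `verify_buggy_copy` reports that a failure in the last step goes unnoticed. With a single shared function, there is only one place for such a test to look.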
| Requirement | Echo message from request |
| Assumption | User-supplied length is valid |
| Reality | Actual message may be empty |
| Outcome | Server returns arbitrary data |
| Impact | Countless HTTPS servers |
Requirement: In April 2014, OpenSSL had to update its “heartbeat” feature, which echoed a message supplied by a user request.
Assumption: The code assumed that the user-supplied message length matched the actual message length.
Reality: In fact, the message could be completely empty.
Outcome: In that case, the server would hand back as many bytes of its own memory as the user requested, including secret key data.
Impact: Countless HTTPS servers had to be patched. It’s unknown whether it was ever exploited.
My article “Goto Fail, Heartbleed, and Unit Testing Culture” explains how this bug could’ve been caught, or prevented, by a unit test. It describes how the absence of a rigorous testing culture allowed a fundamentally flawed assumption to endanger the privacy and safety of millions. It also shows how to challenge such fundamental assumptions and to prevent them from compromising complex systems through unit testing specifically.
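To make the flawed assumption concrete, here is a minimal Python sketch. It is not OpenSSL's code: the function names are hypothetical, and a single `bytes` value stands in for server memory in which the request payload is followed by unrelated data. The first function trusts the claimed length; the second checks it against what was actually sent.

```python
from typing import Optional

# Hypothetical sketch: a simplified model of the heartbeat echo, not OpenSSL's code.

def unsafe_echo(memory: bytes, claimed_length: int) -> bytes:
    """Trust the claimed length. `memory` models the payload plus whatever follows it."""
    return memory[:claimed_length]


def safe_echo(payload: bytes, claimed_length: int) -> Optional[bytes]:
    """Echo the payload only if the claimed length fits what was actually received."""
    if claimed_length < 0 or claimed_length > len(payload):
        return None  # discard the request instead of reading past the payload
    return payload[:claimed_length]
```

A test that challenges the assumption sends an empty payload with a nonzero claimed length. In this model, `unsafe_echo(b"" + b"PRIVATE-KEY", 7)` hands back `b"PRIVATE"`, while `safe_echo(b"", 7)` returns `None`.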
Apple: Quality Culture Initiative
Shortly after that, I was lured back into technology and eventually ended up at Apple, where I worked from November 2018 to November 2022. There, I joined forces with a few others [1] to start the Quality Culture Initiative, another volunteer group inspired by the Testing Grouplet.
Rapid growth, hiring the “best of the best,” build/test tools not scaling
When I joined Apple in 2018, the company was growing fast, and we knew we were “the best of the best.” However, our build and testing tools and infrastructure weren’t keeping up.
Widespread automated and manual testing, but…
There was a strong testing culture, but not around unit testing.
“Testing like a user would” often considered most important
With so much emphasis on the end user experience, many believed that “testing like a user would” was the most important kind of testing.
Tests often large, UI-driven, expensive, slow, flaky, and ineffective
As a result, most tests were user interface driven, requiring full application and system builds on real devices. Since writing smaller tests wasn’t often considered, this led to a proliferation of large, expensive, slow, unreliable, and ineffective tests, generating waste and risk.
“We’re the best” syndrome, deadline pressure
Rather than imposter syndrome, there was a strong sense that we were already the best.
Smart people who hadn’t seen a different way
This led to a lot of smart people suffering because not enough of them even knew that better methods of improving quality existed.
The End of the Rainbow
Too much of a good thing, way too soon
In the beginning, I made the mistake of thinking the Rainbow of Death could help us accelerate adoption. I kept trying to use it as an answer key. I’d expect these “smart people” to “get it,” shortcut the exploration phase, get straight to implementation, and shave years off the process. However, I eventually realized that it’s too complicated a device to apply at the beginning of the change process [2].
Instigating Culture Change
Essential needs an internal community must support
So instead, we focused on these essential needs to simplify our initial efforts [3]:
- Individual Skill Development
- Team/Organizational Alignment
- Quality Work/Results Visibility
Each part of the cycle gains momentum from the others [4], but it’s important to focus on completing one effort before launching the next.
Focus and Simplify
Build a solid foundation before launching multiple programs
Looking back, this is how the Testing Grouplet built up its efforts one step at a time, over time. We did try many things, some in parallel, but we tended to establish one major program at a time before focusing on establishing another [5].
At Apple, I started off trying to use the Rainbow of Death to get too many projects started at once. We didn’t make much progress for about a year [6]. Once I realized my mistake and confessed it to the Quality Culture Initiative, everyone agreed that we needed to focus and simplify our efforts.
Skill Development: Complete training curriculum and volunteer training staff
First, we launched a complete training curriculum with an all-volunteer training staff.
Alignment/Visibility: Internal podcast focused on producing regular episodes
Our internal podcast team then got serious about publishing episodes more regularly.
Alignment/Visibility: Quality Quest in one org, then spread to others via QCI
While I was focused on those programs, another core QCI member established the QCI’s version of Test Certified, Quality Quest, in his organization. We then merged Quality Quest back into the QCI mainstream, allowing it to spread to other organizations.
After that, we began experimenting again with other projects, some sticking, some not so much. Whenever a project seemed to stall, we’d invoke our “focus and simplify” mantra and pour that focus into more productive areas.
Apple: Quality Culture Initiative results
QCI activity as of November 2022—internal results confidential
It’s too early for the QCI to declare victory, and specific results to date are confidential. However, I can broadly describe the state of the QCI’s efforts by the time I left Apple in November 2022.
Training: 16 courses, ~40 volunteer trainers, ~360 sessions, ~6100 check-ins, ~3200 unique individuals
Our training program was wildly successful, with sixteen courses and dozens of volunteer trainers helping thousands of attendees improve their coding and testing capabilities.
Internal podcast: 45 episodes and 500+ subscribers
Our podcast series gave a voice to people of various roles from various organizations, helping drive a rich software quality conversation across Apple.
Quality Quest roadmap: ~80 teams, ~20 volunteer guides
Our Quality Quest roadmap, directly inspired by Test Certified, started helping teams across Apple improve their quality practices and outcomes.
QCI Ambassadors: 6 organizations started, 6 on the way
QCI Ambassadors now use these resources to help their organizations apply QCI principles, practices, and programs to achieve their quality goals.
QCI Roadshow: over 50 presentations
The QCI Roadshow helps introduce QCI concepts and programs directly to groups across the company.
QCI Summit: ~50 recorded sessions, ~60 presenters, ~850 virtual attendees
Our QCI Summit event recruited presenters from across Apple to make their quality work and impact visible. We saw how QCI principles applied to operating systems and services, applications, frontends and backends, machine learning, internal IT, and development infrastructure.
What’s in a name?
What we realized three years after choosing it
One nice feature about the name “Quality Culture Initiative” that we didn’t realize for three years was how it encoded the total Software Quality solution:
Quality is the outcome we’re working to achieve, but as I’ll explain, achieving lasting improvements requires influencing the…
Culture. Culture, however, is the result of complex interactions between individuals over time. Any effective attempt at influencing culture rests upon systems thinking, followed by taking…
Initiative to secure widespread buy-in for systemic changes. Selling a vision for systemic improvement and supporting people in pursuit of that vision requires leadership.
Coming up next
There’s a lot to unpack when it comes to leading a culture change to make software quality visible. The next post will begin doing that by introducing the next major section, Skills Development, Alignment, and Visibility.
After hitting the wall for about the third time, at Apple, I eventually realized the Rainbow of Death wasn’t an answer key, but a blueprint. Yes, it can help trained experts understand what the finished structure looks like. However, it has to come together over time, with many adjustments along the way. You have to find and purchase a site, prepare the site, put in the framing, then the electrical and plumbing infrastructure, etc. You can’t have the bulldozers, construction workers, roofers, siders, painters, and interior decorators all start at the same time. And a blueprint assumes that they all already know what they need to do, and that they’ve already bought into doing it.
Spreading adoption of good testing practices has its own order of dependencies, and you have to provide education and secure buy-in as you go. Sharing the Rainbow of Death is fun and useful for existing Instigators, especially after completing the mission, showing how years of chaos converged into achievement. But it’s not the most effective tool for recruiting new Instigators and influencing the Early Majority. There really aren’t any shortcuts; it’s always going to take time.
In other words, my own idea about how to approach the problem using the Rainbow of Death needed to die, so new ideas could emerge. Specifically, I needed to set aside the complexity of the Rainbow of Death, and embrace the “focus and simplify” principle as a starting point instead.
An Apple internal article used the example of Amundsen and Scott’s expeditions to the South Pole to illustrate the need to “focus and simplify.” Amundsen focused on getting there with the best sled dogs, succeeded on 1911-12-14, and survived. Scott tried a diversified approach, and did reach the South Pole on 1912-01-17, but he and his crew all died during the return trip.
Other articles outside Apple highlight other differences in mindset and leadership style between the two. Amundsen was adaptable to conditions beyond his control, learned from the wisdom of others, assembled the most skilled team possible, and paid attention to details. Scott didn’t heed the weather, was casual about team composition and details, and plowed ahead through sheer assertion of confidence. As Feynman later warned, nature will not be fooled by public relations.
To recap the earlier footnote: I wrote The Rainbow of Death to describe how the Testing Grouplet made focused, deliberate progress over time. Ironically, I then kept trying to use the model in several organizations to launch a bunch of efforts in parallel from the very start. Thankfully, I finally learned my lesson at Apple.