Formative Experiences at Apple

I left the industry after Google, but not for long. At Apple, the Quality Culture Initiative embraced the power of "Focus and Simplify."

16 Aug 2023
Tags: Apple, Making Software Quality Visible, QCI, Testing Grouplet, grouplets, testing

This sixth post in the Making Software Quality Visible series describes the next steps of my personal journey with software quality and automated testing.

I’ll update the full Making Software Quality Visible presentation as this series progresses. Feel free to send me feedback, thoughts, or questions via email or by posting them on the LinkedIn announcement corresponding to this post.

This post picks up where the previous three posts left off.

One more time…

After the Testing Grouplet succeeded, I worked on websearch indexing for a couple of years. Then I burned out, dropped out, and tried the music thing again. It obviously didn’t really work out, or I wouldn’t be speaking to you now.

Berklee College of Music, 2013.

Apple’s goto fail

Finding More Than One Worm in the Apple, CACM, July 2014

My descent back into the tech industry began in February 2014, thanks to Apple’s famous “goto fail” bug.

Requirement	Apply algorithm multiple times
Assumption	Short algorithms safe to copy
Reality	Copies may not stay identical
Outcome	One of six copies had a bug
Impact	Billions of devices

Requirement: Apple had to update part of its open source Secure Transport component, which applied the same algorithm in six places.

Assumption: The developers apparently assumed that this short, ten-line algorithm was safe to copy in its entirety, instead of making it a function.

Reality: One problem with duplication is that the copies may not remain identical.

Outcome: As it happened, one of the six copies of this algorithm picked up an extra “goto” statement that short-circuited a security handshake.

Impact: Once it was discovered and patched, Apple had to push an emergency update to billions of devices. It’s unknown whether it was ever exploited.

The complexity produced by copying and pasting nearly-but-not-quite-identical code yielded poor quality that masked a horrific defect. My article “Finding More Than One Worm in the Apple” explains how this bug could’ve been caught, or prevented, by a unit test.
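To make the duplication point concrete, here is a minimal Python sketch of the general idea, not the original C from Secure Transport. It assumes the repeated steps can be reduced to "hash the handshake inputs, then compare against an expected value." The names `handshake_digest` and `signature_matches`, and the choice of SHA-1 over three byte strings, are hypothetical and chosen only for illustration.

```python
import hashlib
import hmac


def handshake_digest(client_random: bytes, server_random: bytes, params: bytes) -> bytes:
    """Hash the handshake inputs in one shared place (hypothetical stand-in
    for the short algorithm that was copied into six call sites)."""
    digest = hashlib.sha1()
    for part in (client_random, server_random, params):
        digest.update(part)
    return digest.digest()


def signature_matches(client_random: bytes, server_random: bytes,
                      params: bytes, expected: bytes) -> bool:
    """Return True only if the freshly computed digest equals the expected one."""
    actual = handshake_digest(client_random, server_random, params)
    return hmac.compare_digest(actual, expected)


def test_rejects_tampered_params() -> None:
    good = handshake_digest(b"client", b"server", b"params")
    assert signature_matches(b"client", b"server", b"params", good)
    # A version that skipped the final comparison would wrongly accept this.
    assert not signature_matches(b"client", b"server", b"tampered", good)


test_rejects_tampered_params()
```

With a single function there is only one copy to drift, and one small test exercises the failure path that an accidental early exit would otherwise skip.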

OpenSSL’s Heartbleed

Goto Fail, Heartbleed, and Unit Testing Culture, May 2014

Requirement	Echo message from request
Assumption	User-supplied length is valid
Reality	Actual message may be empty
Outcome	Server returns arbitrary data
Impact	Countless HTTPS servers

Requirement: In April 2014, OpenSSL had to update its “heartbeat” feature, which echoed a message supplied by a user request.

Assumption: The code assumed that the user-supplied message length matched the actual message length.

Reality: In fact, the message could be completely empty.

Outcome: In that case, the server would hand back as many bytes of its own memory as the user requested, including secret key data.

Impact: Countless HTTPS servers had to be patched. It’s unknown whether it was ever exploited.

My article “Goto Fail, Heartbleed, and Unit Testing Culture” explains how this bug could’ve been caught, or prevented, by a unit test. It describes how the absence of a rigorous testing culture allowed a fundamentally flawed assumption to endanger the privacy and safety of millions. It also shows how to challenge such fundamental assumptions and to prevent them from compromising complex systems through unit testing specifically.
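The flawed assumption can be illustrated with a small Python simulation. This is a sketch under stated assumptions, not OpenSSL's code or its actual fix: `SECRET` stands in for sensitive data sitting next to the request buffer, and `echo_trusting_length` and `echo_checking_length` are hypothetical names. Python slicing cannot read past a buffer, so the adjacent memory is modeled by concatenating the payload with the secret.

```python
SECRET = b"PRIVATE-KEY"  # hypothetical stand-in for sensitive data near the request buffer


def echo_trusting_length(payload: bytes, claimed_len: int) -> bytes:
    """Flawed echo: trusts claimed_len and reads past the end of the payload."""
    memory = payload + SECRET  # simulated adjacent server memory
    return memory[:claimed_len]


def echo_checking_length(payload: bytes, claimed_len: int) -> bytes:
    """Safer echo: rejects a claimed length that does not fit the real payload."""
    if claimed_len < 0 or claimed_len > len(payload):
        raise ValueError("claimed length does not fit the actual payload")
    return payload[:claimed_len]


def test_empty_payload_with_large_claimed_length() -> None:
    # The flawed version leaks the simulated secret.
    assert echo_trusting_length(b"", len(SECRET)) == SECRET
    # The checked version refuses the same request.
    try:
        echo_checking_length(b"", len(SECRET))
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")
    # A well-formed request is still echoed back unchanged.
    assert echo_checking_length(b"hello", 5) == b"hello"


test_empty_payload_with_large_claimed_length()
```

The test is the interesting part: it encodes the question "what if the claimed length and the actual length disagree?", which is exactly the assumption that went unchallenged.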

Apple: Quality Culture Initiative

2018-present

Shortly after that, I was lured back into technology. I eventually joined Apple in November 2018 and left in November 2022. At Apple, I joined forces with a few others¹ to start the Quality Culture Initiative, another volunteer group inspired by the Testing Grouplet.

Rapid growth, hiring the “best of the best,” build/test tools not scaling
When I joined Apple in 2018, the company was growing fast, and we knew we were “the best of the best.” However, our build and testing tools and infrastructure weren’t keeping up.
Widespread automated and manual testing, but…
There was a strong testing culture, but not around unit testing.
“Testing like a user would” often considered most important
With so much emphasis on the end user experience, many believed that “testing like a user would” was the most important kind of testing.
Tests often large, UI-driven, expensive, slow, flaky, and ineffective
As a result, most tests were user interface driven, requiring full application and system builds on real devices. Since writing smaller tests wasn’t often considered, this led to a proliferation of large, expensive, slow, unreliable, and ineffective tests, generating waste and risk.
“We’re the best” syndrome, deadline pressure
Rather than imposter syndrome, there was a strong sense that we were already the best.
Smart people who hadn’t seen a different way
This led to a lot of smart people suffering because not enough of them even knew that better methods of improving quality existed.

The End of the Rainbow

Too much of a good thing, way too soon

In the beginning, I made the mistake of thinking the Rainbow of Death could help us accelerate adoption. I kept trying to use it as an answer key. I’d expect these “smart people” to “get it,” shortcut the exploration phase, get straight to implementation, and shave years off the process. However, I eventually realized that it’s too complicated a device to apply at the beginning of the change process.²

Instigating Culture Change

Essential needs an internal community must support

So instead, we focused on these essential needs to simplify our initial efforts:³

Individual Skill Development
Team/Organizational Alignment
Quality Work/Results Visibility

Each part of the cycle gains momentum from the others,⁴ but it’s important to focus on completing one effort before launching the next.

Focus and Simplify

Build a solid foundation before launching multiple programs

Looking back, this is how the Testing Grouplet built up its efforts one step at a time, over time. We did try many things, some in parallel, but we tended to establish one major program at a time before focusing on establishing another.⁵

At Apple, I started off trying to use the Rainbow of Death to get too many projects started at once. We didn’t make much progress for about a year.⁶ Once I realized my mistake and confessed it to the Quality Culture Initiative, everyone agreed that we needed to focus and simplify our efforts.

Skill Development: Complete training curriculum and volunteer training staff
First, we launched a complete training curriculum with an all-volunteer training staff.
Alignment/Visibility: Internal podcast focused on producing regular episodes
Our internal podcast team then got serious about publishing episodes more regularly.
Alignment/Visibility: Quality Quest in one org, then spread to others via QCI
While I was focused on those programs, another core QCI member established the QCI’s version of Test Certified, Quality Quest, in his organization. We then merged Quality Quest back into the QCI mainstream, allowing it to spread to other organizations.

After that, we began experimenting again with other projects, some sticking, some not so much. Whenever a project seemed to stall, we’d invoke our “focus and simplify” mantra and pour that focus into more productive areas.

Apple: Quality Culture Initiative results

QCI activity as of November 2022—internal results confidential

It’s too early for the QCI to declare victory, and specific results to date are confidential. However, I can broadly describe the state of the QCI’s efforts by the time I left Apple in November 2022.

Training: 16 courses, ~40 volunteer trainers, ~360 sessions, ~6100 check-ins, ~3200 unique individuals
Our training program was wildly successful, with sixteen courses and dozens of volunteer trainers helping thousands of attendees improve their coding and testing capabilities.
Internal podcast: 45 episodes and 500+ subscribers
Our podcast series gave a voice to people of various roles from various organizations, helping drive a rich software quality conversation across Apple.
Quality Quest roadmap: ~80 teams, ~20 volunteer guides
Our Quality Quest roadmap, directly inspired by Test Certified, started helping teams across Apple improve their quality practices and outcomes.
QCI Ambassadors: 6 organizations started, 6 on the way
QCI Ambassadors now use these resources to help their organizations apply QCI principles, practices, and programs to achieve their quality goals.
QCI Roadshow: over 50 presentations
The QCI Roadshow helps introduce QCI concepts and programs directly to groups across the company.
QCI Summit: ~50 recorded sessions, ~60 presenters, ~850 virtual attendees
Our QCI Summit event recruited presenters from across Apple to make their quality work and impact visible. We saw how QCI principles applied to operating systems and services, applications, frontends and backends, machine learning, internal IT, and development infrastructure.

What’s in a name?

What we realized three years after choosing it

One nice feature of the name “Quality Culture Initiative,” which we didn’t notice for three years, is how it encodes the total Software Quality solution:

Quality is the outcome we’re working to achieve, but as I’ll explain, achieving lasting improvements requires influencing the…
Culture. Culture, however, is the result of complex interactions between individuals over time. Any effective attempt at influencing culture rests upon systems thinking, followed by taking…
Initiative to secure widespread buy-in for systemic changes. Selling a vision for systemic improvement and supporting people in pursuit of that vision requires leadership.

Coming up next

There’s a lot to unpack when it comes to leading a culture change to make software quality visible. The next post will begin doing that by introducing the next major section, Skills Development, Alignment, and Visibility.

Individual Skills Development

Footnotes

¹ This included Jake Spracher and Kirk Russell, who both left Apple before I did. The other folks are still at Apple, and therefore I’ll leave them anonymous for now.

² After hitting the wall for about the third time, at Apple, I eventually realized the Rainbow of Death wasn’t an answer key, but a blueprint. Yes, it can help trained experts understand what the finished structure looks like. However, it has to come together over time, with many adjustments along the way. You have to find and purchase a site, prepare the site, put in the framing, then the electrical and plumbing infrastructure, etc. You can’t have the bulldozers, construction workers, roofers, siders, painters, and interior decorators all start at the same time. And that assumes they all already know what they need to do and have already bought into doing it.

Spreading adoption of good testing practices has its own order of dependencies, and you have to provide education and secure buy-in as you go. Sharing the Rainbow of Death is fun and useful for existing Instigators, especially after completing the mission, showing how years of chaos converged into achievement. But it’s not the most effective tool for recruiting new Instigators and influencing the Early Majority. There really aren’t any shortcuts; it’s always going to take time.

In other words, my own idea about how to approach the problem using the Rainbow of Death needed to die, so new ideas could emerge. Specifically, I needed to set aside the complexity of the Rainbow of Death, and embrace the “focus and simplify” principle as a starting point instead.

³ An Apple internal article used the example of Amundsen and Scott’s expeditions to the South Pole to illustrate the need to “focus and simplify.” Amundsen focused on getting there with the best sled dogs, succeeded on 1911-12-14, and survived. Scott tried a diversified approach, and did reach the South Pole on 1912-01-17, but he and his crew all died during the return trip.

Other articles outside Apple highlight other differences in the mindset and leadership styles between the two. Amundsen was adaptable to conditions beyond his control, learned from the wisdom of others, assembled the most skilled team possible, and paid attention to details. Scott didn’t heed the weather, was casual about team composition and details, and plowed ahead through sheer assertion of confidence. As Feynman later warned, nature will not be fooled by public relations.
- The Leadership Lessons of the Race to the South Pole
- South to the Pole: Leadership Wins the Race

⁴ This is similar to Jim Collins’s Flywheel Effect.

⁵ To recap the earlier footnote: I wrote The Rainbow of Death to describe how the Testing Grouplet made focused, deliberate progress over time. Ironically, I then kept trying to use the model in several organizations to launch a bunch of efforts in parallel from the very start. Thankfully, I finally learned my lesson at Apple.

⁶ Greg McKeown calls this “making a millimeter of progress in a million directions” in his book Essentialism: The Disciplined Pursuit of Less.