Test Doubles

Test Doubles are lightweight, controllable objects that replace production dependencies in smaller tests. Adding seams in your logic to accommodate them enables much faster, more reliable, more thorough testing.

06 Sep 2023
Tags: Making Software Quality Visible, guitar gear, programming, technical, testing

This thirteenth post in the Making Software Quality Visible series describes the different kinds of Test Doubles and why they’re useful.

Image of a Marshall MS-2 Micro Amp compared to a wall of Marshall stacks, illustrating the principle of Test Doubles implementing a production dependency interface.
Like test doubles, practice amps make it faster, easier, and less nerve-racking to check your work before going into production.
This image was derived from: Marshall MS-2 Micro Amp; Reverb: A History Of Marshall Amps: The Early Years.

I’ll update the full Making Software Quality Visible presentation as this series progresses. Feel free to send me feedback, thoughts, or questions via email or by posting them on the LinkedIn announcement corresponding to this post.

Continuing after The Test Pyramid and the Chain Reaction…

Definition

Test doubles are lightweight, controllable objects implementing the same interface as a production dependency. This enables the test author to isolate the code under test and control its environment very precisely via dependency injection.

“Dependency injection” is a fancy term for passing an object encapsulating a dependency as a constructor or function argument. Doing this instead of instantiating or accessing the dependency directly creates a seam, allowing a test double to stand in for the dependency. (Dependency injection frameworks exist for some languages, but while some may find them convenient, they’re not strictly necessary. I’ve never used any, and I inject dependencies like crazy.)
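Here is a minimal sketch of such a seam in Python. The `Greeter`, `SystemClock`, and `FixedClock` names are hypothetical, invented purely for illustration; the point is only that the clock arrives as a constructor argument rather than being created inside the class.

```python
from datetime import datetime


class SystemClock:
    """Production dependency: reads the real system time."""

    def now(self) -> datetime:
        return datetime.now()


class FixedClock:
    """Test double: always reports the time it was constructed with."""

    def __init__(self, fixed: datetime) -> None:
        self._fixed = fixed

    def now(self) -> datetime:
        return self._fixed


class Greeter:
    """Hypothetical code under test. The clock is injected, not instantiated."""

    def __init__(self, clock) -> None:
        self._clock = clock  # the seam: any object with a now() method works

    def greeting(self) -> str:
        return "Good morning" if self._clock.now().hour < 12 else "Good afternoon"


# Production wiring would be Greeter(SystemClock()).
# In a test, the double lets us pin the environment precisely:
morning = Greeter(FixedClock(datetime(2023, 9, 6, 9, 0)))
afternoon = Greeter(FixedClock(datetime(2023, 9, 6, 15, 0)))
assert morning.greeting() == "Good morning"
assert afternoon.greeting() == "Good afternoon"
```

Had `Greeter` called `datetime.now()` directly, a test could only check whichever branch matched the time of day it happened to run.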

I also defined test doubles on slide 49 of Automated Testing—Why Bother?:

Test doubles are substitutes for more complex objects in an automated test. They are easier to set up, easier to control, and often make tests much faster thanks to the fact that they do not have the same dependencies as real production objects.

Types of Test Doubles

Some of what follows owes a debt to Martin Fowler’s Mocks Aren’t Stubs.

The various kinds of test doubles are:

Dummy: A placeholder value with no bearing on the test other than enabling the code to compile
Stub: An object programmed to return a hardcoded or trivially computed value
Spy: A stub that can remember how many times it was called and with which arguments
Mock: An object that can be programmed to validate expected calls in a specific order, as well as return specific values
Fake: An object that fully simulates a production dependency using a less complicated and faster implementation (e.g., an in-memory database or file system, a local HTTP server)
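To make the distinctions a little more concrete, here is a rough sketch of a dummy, a stub, a spy, and a fake working together, leaving mocks aside. Every name here (`welcome_user`, `StubUserDirectory`, `SpyMailer`, `FakeKeyValueStore`) is hypothetical and exists only for this example.

```python
class StubUserDirectory:
    """Stub: returns a hardcoded value regardless of input."""

    def lookup_email(self, username):
        return "user@example.com"


class SpyMailer:
    """Spy: records each call so the test can inspect it afterwards."""

    def __init__(self):
        self.sent = []

    def send(self, to, subject):
        self.sent.append((to, subject))
        return True


class FakeKeyValueStore:
    """Fake: a working in-memory stand-in for a real datastore."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


def welcome_user(username, directory, mailer, store, logger):
    """Hypothetical code under test; every dependency is injected."""
    email = directory.lookup_email(username)
    if store.get(username) is None:  # only greet first-time users
        store.put(username, email)
        mailer.send(email, "Welcome!")
    return email


mailer, store = SpyMailer(), FakeKeyValueStore()
dummy_logger = None  # Dummy: required by the signature, never used on this path

welcome_user("pat", StubUserDirectory(), mailer, store, dummy_logger)
welcome_user("pat", StubUserDirectory(), mailer, store, dummy_logger)

assert store.get("pat") == "user@example.com"             # the fake kept real state
assert mailer.sent == [("user@example.com", "Welcome!")]  # the spy saw exactly one call
```

Note that the assertions check outcomes (stored state, recorded messages) after the fact, rather than dictating in advance which calls must happen in which order.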

Much more than “Mocks”

People often call all test doubles “mocks,” and packages making it easy to implement test doubles are often called “mocking libraries.” This is unfortunate, as mocks should be the last option one chooses.

Mocks can validate expected side effects (i.e., behaviors not reflected in return values or easily observable environmental changes) that other test doubles can’t. However, this binds them to implementation details that can render tests brittle in the face of implementation changes. Tests that overuse mocks in this way are often cited as a reason why people find test doubles and unit testing painful.
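The brittleness can be sketched with Python's standard `unittest.mock` module. The `save_all_v1` and `save_all_v2` functions and the database methods they call are hypothetical. Note also that `unittest.mock` verifies calls after the fact rather than preprogramming expectations up front, but the coupling to call order is the same.

```python
from unittest.mock import Mock, call


def save_all_v1(db, items):
    """Hypothetical implementation: one write per item."""
    db.begin()
    for item in items:
        db.write(item)
    db.commit()


def save_all_v2(db, items):
    """Hypothetical refactoring: a single batched write, same end result."""
    db.begin()
    db.write_many(list(items))
    db.commit()


# The test's expectation spells out the exact calls, in order.
expected = [call.begin(), call.write("a"), call.write("b"), call.commit()]

db = Mock()
save_all_v1(db, ["a", "b"])
assert db.mock_calls == expected  # matches the original implementation

db = Mock()
save_all_v2(db, ["a", "b"])
assert db.mock_calls != expected  # the same expectation no longer matches
```

A test built on an in-memory fake that checks only what ends up stored would, by contrast, not care which of the two implementations ran.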

A practical analogy

My favorite concrete, physical example of using a test double is using a practice amplifier to practice electric guitar:

You can practice in relative quiet, using something as small as a Marshall micro amp, a Blackstar Fly 3, or a Mustang micro. You do this before playing with others or getting on stage to make sure you’ve got your own parts working. If anything sounds bad, you know it’s all your fault.

This is analogous to writing small tests, with the practice amp as the test double. You’re figuring out immediately if your own performance meets expectations or needs fixing—without bothering anyone else.

Images derived from: Marshall MS-2 Micro Amp; Blackstar Amps - Fly Series; Fender Mustang Micro

You can rehearse with your band using a larger, louder amplifier, like a Marshall DSL40CR, Fender Blues Junior IV, or Fender Mustang GTX100. Or perhaps you’d prefer a Marshall Studio Vintage 20 or a Paul Reed Smith HDRX 20. This enables you to work out issues with other players before getting on stage. If you’ve practiced your parts enough and something sounds bad at this point, you know something’s wrong with the band dynamic.

This is analogous to writing medium tests, with the slightly larger amp still acting as a test double. You’re figuring out with your bandmates specific issues arising from working through the material together. You can start, stop, and repeat as often as necessary without burdening the audience.

Images derived from: Marshall DSL40CR; Fender Blues Junior IV; Fender Mustang GTX100; Marshall SV20H; Paul Reed Smith HDRX 20

You and the band can then run through a soundcheck on stage, making sure everything sounds good together while plugging into your Marshall stacks. Everyone else will be using their production gear, the full sound system, and the lighting rig, in the actual performance space. If the band is well rehearsed but something sounds wrong at this level, you know it’s specific to the integration of the entire system.

This is analogous to writing large tests. You’re using the real production dependencies and running the entire production system. However, this still happens before the actual performance, giving you a chance to detect and resolve showstopper issues before the audience arrives.

Image from: Reverb: A History Of Marshall Amps: The Early Years.

Finally, you play in front of the audience. Things can still go wrong, and you’ll have to adapt in the moment and discuss afterwards how to prevent repeat issues. However, after all the practicing, rehearsals, and soundchecks, relatively few things can still go wrong, and those that do are likely unique to actual performance situations.

This is analogous to shipping to production. You can’t expect perfection, and you may discover new issues not uncovered by previous practicing, rehearsing, and testing. However, you can focus on those relatively few remaining issues, since so many were prevented or resolved before this point.

Of course, there are more options than this. There’s nothing saying you couldn’t use any of these amplifiers in any other situation—you could use, say, the Fender Mustang GTX100 for everything. It can even plug directly into the mixing deck and emulate a mic’d cabinet. But hopefully the point of the analogy remains clear: The common interface gives you the freedom to swap implementations as you see fit.

What kind is it?

The only question is, what kind of “test double” is a practice amplifier? Based on the definitions above, my money’s on calling it a “fake.” It’s a lighter-weight implementation of the full production dependency, with essentially the same interface, modulo EQ and volume controls and exact vacuum tube reactivity. Even so, for most practical purposes when it comes to practicing, it’s close enough. And since there’s no interface for preprogramming responses, it isn’t a stub or a mock.

Coming up next

The next post will discuss Contract and Collaboration Tests. These tests essentially answer the question: “Did something change that’s beyond my control, or did I screw something up?” while narrowing potential sources of error. They also help ensure our test doubles remain faithful to production behavior.
