Contract/Collaboration Tests and Internal APIs

Contract and Collaboration tests are medium sized tests that validate how one's own code interacts with an external dependency. Internal APIs are adapters that insulate most of your code from changes in such dependencies.

08 Sep 2023
Tags: Making Software Quality Visible, programming, technical, testing

This fourteenth post in the Making Software Quality Visible series describes key features of medium sized contract and collaboration tests.

I’ll update the full Making Software Quality Visible presentation as this series progresses. Feel free to send me feedback, thoughts, or questions via email or by posting them on the LinkedIn announcement corresponding to this post.

Continuing after Test Doubles…

Shoutout!

Shoutout to Francisco Candalija for bringing contract and collaboration tests to my attention. He influenced how I now think and talk about medium/integration tests and my own “internal API” concept. (Some of the below I also described in an email to my former Google colleague Alex Buccino on 2022-12-23.)

Contract and Collaboration Tests

Contract tests essentially answer the question: “Did something change that’s beyond my control, or did I screw something up?” while narrowing potential sources of error. They also help ensure our test doubles remain faithful to production behavior.

I like thinking of contract tests in this way rather than how Pact defines them, even though the Pact definition is very popular. Writing a contract test quickly using a special tool and calling it a day can provide a false sense of confidence. Such tests are prone to become brittle and flaky if one doesn’t consider how they support the overall architecture and testing strategy.

Internal APIs

An “internal API” is a wrapper that’s kind of a superset of Proxy and Adapter from Design Patterns. It’s an interface you design within your project that translates an external (or complicated internal) dependency’s language and semantics into your own custom version. Using your own interface insulates the rest of your code from directly depending on the dependency’s interface.

One very common example is creating your own Database object that exposes your own “ideal” Database API to the rest of your app. This object encapsulates all SQL queries, external database API calls, logging, error handling, and retry mechanisms, etc. in a single location. This obviates the need to pepper these details throughout your own code.

What this means is:

The internal API introduces a seam enabling you to write many more fast, stable, small tests for your application via dependency injection and test doubles. (Michael Feathers introduced the term “seam” in Working Effectively with Legacy Code.) This makes the code and the tests easier to write and to maintain, since the all the tests no longer become integration tests by default.
You do still need to test your API implementation against the real dependency—but now you have only one object to test using a medium/integration test. This would be your contract test.
Any integration problems with a particular dependency are detected by one test, rather than triggering failures across the entire suite. This improves the signal to noise ratio while tightening the feedback loop, making it faster and easier to diagnose and repair the issue.
The contract test makes sure any test doubles based on the same interface as the internal API wrapper are faithful to production. If a contract test fails in a way that invalidates your internal API, you’ll know to update your API and test doubles based on it.
If you want to upgrade or even replace a dependency, you have one implementation to update, not multiple places throughout the code. This protects your system against revision or vendor lock in.
In fact, you can add an entirely new class implementing the same interface and configure which implementation to use at runtime. This makes it easy and safe to try the old and new implementations without major surgery or risk.

For all these reasons, combining internal APIs with contract tests makes your test suite faster, more reliable, and easier to maintain.

A concrete example

Like many languages, Python provides a common DBAPI. This enables you to use a local, in memory database (typically SQLite) to fake (i.e. stand in for) a production database.

I did this not long ago for some Python code that threw a DBAPI error in production every few days, locking up our server fleet:

Though we used Postgres in prod, I could simulate the same DBAPI error in a test on my desk by using the standard sqlite3 module.
I reproduced the bug, in which the system didn’t abort a failed transaction due to a dropped connection, blocking further operations.

I wouldn’t call the test “small” or “medium,” but “small-ish.” It was as small a contract test as you could get, and while it wasn’t super fast, it was quite quick.
I fixed the bug—and the test—by introducing a Database abstraction that implemented a rollback/reconnect/retry mechanism. The relatively small size, low complexity, and quick speed of the test enabled me to iterate quickly on the solution.

(I also set a one hour timeout on database connections. This alone might’ve resolved the problem, but it was worth adding the new abstraction that provably resolved the problem.)
I shipped the fix—and bye bye production error! I kept monitoring the logs and never saw it happen after that.

This contract test enabled me to define an internal Database API based on the Python DBAPI. The DBAPI ensures that the Database API can be reused—and tested—with different databases that conform to its specifications. The rest of our code, now using the new Database object, could be tested more quickly using test doubles. So long as the contract test passes, the test doubles should remain faithful substitutes. And if we wanted to switch from Postgres to another production database, likely none of our code would’ve had to change.

The contract test did require some subtle setup and comments explaining it. Still, dealing with one such test and object under test beats the hell out of dealing with one or more large system tests. And it definitely beats pushing a “fix” and having no idea whether it stands a chance of holding up in production!

Coming up next

The next post will cover footnotes discussing why the main obstacle to replacing heroics with a Chain Reaction is belief, not technology.

Why the Chain Reaction hasn’t happened everywhere already