The Pseudo-xUnit Pattern

While writing the tests for the "goto fail" and Heartbleed bugs, I stumbled upon an xUnit-like pattern for writing tests without a framework.

05 Jun 2014 - Boston
Tags: Heartbleed, OpenSSL testing, goto fail, technical

One of my design goals for writing the “goto fail” unit test and the Heartbleed unit test (submitted to OpenSSL as ssl/heartbeat_test.c) was to avoid importing any testing frameworks or tools into the existing build environment. However, having been an ardent user of Google Test (and its companion, Google Mock) for years, I found the influence of xUnit shining through in the resulting test code.

What’s more, others and I have warned against the dangers of data-driven tests, and I’ve lamented that my love for the Go programming language is slightly tainted by its advice to resort to table-driven tests. The pattern I discovered in these unit tests struck me, after the fact, as a possible solution to the problem of data-driven/table-driven tests, whether or not a testing framework is used.¹

What follows is an illustration of the Pseudo-xUnit pattern based on a mapping from the general xUnit concepts outlined in the Wikipedia entry on xUnit to their implementation in my Heartbleed unit test.

Differences from xUnit
Assertions
Test Result Formatter
Test Fixtures
Test Suites
Test Execution
Test Cases
Test Runner
Conclusion
Footnotes

Differences from xUnit

In typical xUnit implementations in Object-Oriented languages, there is some sort of TestFixture base class. The author of a test inherits from this class, overrides its SetUp() and TearDown() methods, and then adds test cases as member functions. In Google Test for C++, one can also use the fixture’s constructor and destructor instead of SetUp() and TearDown(), and each new test case is defined as a subclass of the user-defined fixture. However it’s implemented, the framework guarantees that SetUp() is called before each test case and TearDown() is called after, that every test case is automatically executed, and that errors are reported in some standard format.

These features make xUnit-based testing frameworks very convenient: the xUnit idiom is familiar to many programmers, a lot of mechanical details are handled automatically, and error reporting is standardized across all tests using the framework. The pattern described below offers none of these guarantees, which is why I call it “Pseudo-xUnit”: it’s up to the test author to ensure that SetUp() and TearDown() are called from each test case, to write out all of the assertions and error messages in an execution function, and to remember to add each test case function to the test runner (i.e. main()). Still, the resulting code organization is very similar to xUnit, which makes tests written in this style more readable, especially to people used to xUnit-based frameworks in other languages.
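To make the overall shape concrete, here is a minimal sketch of the pattern applied to a toy function. Every name in it (Add, AddFixture, ExecuteAdd, RunAllTests, and the two test cases) is hypothetical and has nothing to do with OpenSSL; it only illustrates which responsibilities fall to the test author rather than to a framework.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical code under test. */
static int Add(int a, int b) { return a + b; }

/* Fixture: the inputs and expected outputs for one test case. */
typedef struct {
  const char *test_case_name;
  int a;
  int b;
  int expected_sum;
} AddFixture;

/* The author, not a framework, must call SetUp() from each test case. */
static AddFixture SetUp(const char *test_case_name) {
  AddFixture fixture;
  memset(&fixture, 0, sizeof(fixture));
  fixture.test_case_name = test_case_name;
  return fixture;
}

/* Nothing to release in this toy fixture; kept to show where cleanup goes. */
static void TearDown(AddFixture fixture) { (void)fixture; }

/* Execution function: all assertions and message formatting live here. */
static int ExecuteAdd(AddFixture fixture) {
  int result = 0;
  int actual = Add(fixture.a, fixture.b);
  if (actual != fixture.expected_sum) {
    printf("%s failed: expected %d, received %d\n",
           fixture.test_case_name, fixture.expected_sum, actual);
    result = 1;
  }
  TearDown(fixture);
  return result;
}

static int TestAddPositiveNumbers(void) {
  AddFixture fixture = SetUp(__func__);
  fixture.a = 2;
  fixture.b = 3;
  fixture.expected_sum = 5;
  return ExecuteAdd(fixture);
}

static int TestAddNegativeNumbers(void) {
  AddFixture fixture = SetUp(__func__);
  fixture.a = -2;
  fixture.b = -3;
  fixture.expected_sum = -5;
  return ExecuteAdd(fixture);
}

/* The runner: each new test case must be added here by hand. */
static int RunAllTests(void) {
  return TestAddPositiveNumbers() + TestAddNegativeNumbers();
}
```

A main() function would simply call RunAllTests() and turn a nonzero count into a failing exit status. The sections below map each piece of this skeleton to its counterpart in the Heartbleed test.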

Assertions

The Pseudo-xUnit pattern does not rely on standard assertions. The execution function will contain all of the assertions for all of the test cases.

Test Result Formatter

The Pseudo-xUnit pattern does not rely on a standard formatter. The execution function will define the result formatting for all of the test cases.

Test Fixtures

The “test fixture” is the data structure and set of associated operations that establish the environment for each test case. The structure, as in many data-driven tests, will contain the test inputs and expected outputs (and, in this case, a pointer to the code under test as process_heartbeat):

typedef struct {
  SSL_CTX *ctx;
  SSL *s;
  const char* test_case_name;
  int (*process_heartbeat)(SSL* s);
  unsigned char* payload;
  int sent_payload_len;
  int expected_return_value;
  int return_payload_offset;
  int expected_payload_len;
  const char* expected_return_payload;
} HeartbleedTestFixture;

Unlike data-driven tests, we will not define an array of these structures and loop through them. Instead, we will define a SetUp() function that each test case will call to create a new structure²:

static HeartbleedTestFixture SetUp(const char* const test_case_name,
    const SSL_METHOD* meth) {
  HeartbleedTestFixture fixture;
  int setup_ok = 1;
  memset(&fixture, 0, sizeof(fixture));
  fixture.test_case_name = test_case_name;

  fixture.ctx = SSL_CTX_new(meth);
  if (!fixture.ctx) {
    fprintf(stderr, "Failed to allocate SSL_CTX for test: %s\n",
            test_case_name);
    setup_ok = 0;
    goto fail;
  }

  /* snip other allocation and error handling blocks */

fail:
  if (!setup_ok) {
    ERR_print_errors_fp(stderr);
    exit(EXIT_FAILURE);
  }
  return fixture;
}

Note that the fixture is copied by value; no need to mess with more memory management than is necessary. There’s also an associated TearDown() function to ensure proper resource cleanup:

static void TearDown(HeartbleedTestFixture fixture) {
  ERR_print_errors_fp(stderr);
  SSL_free(fixture.s);
  SSL_CTX_free(fixture.ctx);
}

Test Suites

All of the tests in a suite will share a common prefix:

/* The test cases in the Dtls1 suite */
static int TestDtls1NotBleeding();
static int TestDtls1NotBleedingEmptyPayload();
static int TestDtls1Heartbleed();
static int TestDtls1HeartbleedEmptyPayload();
static int TestDtls1HeartbleedExcessivePlaintextLength();

/* The test cases in the Tls1 suite */
static int TestTls1NotBleeding();
static int TestTls1NotBleedingEmptyPayload();
static int TestTls1Heartbleed();
static int TestTls1HeartbleedEmptyPayload();

Multiple test suites can share the same underlying fixture structure and can “inherit” base fixture behavior by defining their own variations of the SetUp() function (and possibly TearDown()) that make a call to the “base” version:

static HeartbleedTestFixture SetUpDtls(const char* const test_case_name) {
  HeartbleedTestFixture fixture = SetUp(test_case_name,
                                        DTLSv1_server_method());
  fixture.process_heartbeat = dtls1_process_heartbeat;

  /* As per dtls1_get_record(), skipping the following from the beginning of
   * the returned heartbeat message:
   * type-1 byte; version-2 bytes; sequence number-8 bytes; length-2 bytes
   *
   * And then skipping the 1-byte type encoded by process_heartbeat for
   * a total of 14 bytes, at which point we can grab the length and the
   * payload we seek.
   */
  fixture.return_payload_offset = 14;
  return fixture;
}

static HeartbleedTestFixture SetUpTls(const char* const test_case_name) {
  HeartbleedTestFixture fixture = SetUp(test_case_name,
                                        TLSv1_server_method());
  fixture.process_heartbeat = tls1_process_heartbeat;
  fixture.s->handshake_func = DummyHandshake;

  /* As per do_ssl3_write(), skipping the following from the beginning of
   * the returned heartbeat message:
   * type-1 byte; version-2 bytes; length-2 bytes
   *
   * And then skipping the 1-byte type encoded by process_heartbeat for
   * a total of 6 bytes, at which point we can grab the length and the payload
   * we seek.
   */
  fixture.return_payload_offset = 6;
  return fixture;
}

Test Execution

The “execution function” prepares helper data based on input values from the test fixture, executes the code under test (fixture.process_heartbeat(s) in this example), performs all of the assertions, and handles their output formatting. It returns zero on success and one on failure, no matter how many assertions may have failed:

static int ExecuteHeartbeat(HeartbleedTestFixture fixture) {
  int result = 0;
  SSL* s = fixture.s;
  unsigned char *payload = fixture.payload;
  unsigned char sent_buf[kMaxPrintableCharacters + 1];

  s->s3->rrec.data = payload;
  s->s3->rrec.length = strlen((const char*)payload);
  *payload++ = TLS1_HB_REQUEST;
  s2n(fixture.sent_payload_len, payload);

  /* Make a local copy of the request, since it gets overwritten at some
   * point */
  memcpy((char *)sent_buf, (const char*)payload, sizeof(sent_buf));

  int return_value = fixture.process_heartbeat(s);

  if (return_value != fixture.expected_return_value) {
    printf("%s failed: expected return value %d, received %d\n",
           fixture.test_case_name, fixture.expected_return_value,
           return_value);
    result = 1;
  }

  /* If there is any byte alignment, it will be stored in wbuf.offset. */
  unsigned const char *p = &(s->s3->wbuf.buf[
      fixture.return_payload_offset + s->s3->wbuf.offset]);
  int actual_payload_len = 0;
  n2s(p, actual_payload_len);

  if (actual_payload_len != fixture.expected_payload_len) {
    printf("%s failed:\n  expected payload len: %d\n  received: %d\n",
           fixture.test_case_name, fixture.expected_payload_len,
           actual_payload_len);
    PrintPayload("sent", sent_buf, strlen((const char*)sent_buf));
    PrintPayload("received", p, actual_payload_len);
    result = 1;
  } else {
    char* actual_payload = strndup((const char*)p, actual_payload_len);
    if (strcmp(actual_payload, fixture.expected_return_payload) != 0) {
      printf("%s failed:\n  expected payload: \"%s\"\n  received: \"%s\"\n",
             fixture.test_case_name, fixture.expected_return_payload,
             actual_payload);
      result = 1;
    }
    free(actual_payload);
  }

  if (result != 0) {
    printf("** %s failed **\n--------\n", fixture.test_case_name);
  }
  TearDown(fixture);
  return result;
}

There can be multiple execution functions if need be, and possibly multiple separate assertion functions. Also, while in this version the execution function calls TearDown(fixture), the actual OpenSSL version of the test has the test cases execute TearDown() (via a macro defined in the test file).
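As a sketch of what separate assertion functions might look like, the helpers below factor the comparison and the message formatting out of the execution function. ExpectIntEq and ExpectStrEq are hypothetical names chosen for illustration; they are not part of heartbeat_test.c, and the message formats are only modeled on the ones above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 0 if the values match and 1 otherwise, so
 * that a caller can accumulate failures with "result |= ...". */
static int ExpectIntEq(const char *test_case_name, const char *what,
                       int expected, int actual) {
  if (expected == actual) {
    return 0;
  }
  printf("%s failed: expected %s %d, received %d\n",
         test_case_name, what, expected, actual);
  return 1;
}

/* Hypothetical helper: the same idea for NUL-terminated strings. */
static int ExpectStrEq(const char *test_case_name, const char *what,
                       const char *expected, const char *actual) {
  if (strcmp(expected, actual) == 0) {
    return 0;
  }
  printf("%s failed:\n  expected %s: \"%s\"\n  received: \"%s\"\n",
         test_case_name, what, expected, actual);
  return 1;
}
```

With helpers like these, the return value check in an execution function could shrink to a single statement such as result |= ExpectIntEq(fixture.test_case_name, "return value", fixture.expected_return_value, return_value), while the function as a whole still returns zero or one.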

Test Cases

Each test case is a function that constructs and sets up a test fixture instance, overrides the fixture structure with inputs and expected outputs for that specific test case, and calls the execution function and TearDown() (either directly or as part of the execution function). It returns the result of the execution function, which will be zero on success and one on failure:

static int TestDtls1Heartbleed() {
  HeartbleedTestFixture fixture = SetUpDtls(__func__);
  /* Three-byte pad at the beginning for type and payload length */
  unsigned char payload_buf[] = "   HEARTBLEED                ";

  fixture.payload = &payload_buf[0];
  fixture.sent_payload_len = kMaxPrintableCharacters;
  fixture.expected_return_value = 0;
  fixture.expected_payload_len = 0;
  fixture.expected_return_payload = "";
  return ExecuteHeartbeat(fixture);
}

Notice the passing of __func__ to the SetUp() function. This is how the fixture gets initialized with the name of the test function for better assertion failure formatting. (Note that __func__ is a C99 feature; with Windows compilers that do not support it, __FUNCTION__ should be used instead.)
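One way to hide that difference is a small macro that picks whichever spelling the compiler supports. The sketch below is an assumption about how this could be done, not a copy of what the OpenSSL test does; the macro name TEST_CASE_NAME and the function TestExampleName are hypothetical.

```c
#include <string.h>

/* Hypothetical macro: prefer the standard C99 identifier, fall back to the
 * __FUNCTION__ extension offered by MSVC and GCC-compatible compilers, and
 * finally to a fixed string so the test still compiles. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#define TEST_CASE_NAME __func__
#elif defined(_MSC_VER) || defined(__GNUC__)
#define TEST_CASE_NAME __FUNCTION__
#else
#define TEST_CASE_NAME "(unknown test case)"
#endif

/* The name has static storage duration, so returning the pointer (or
 * storing it in a fixture) is safe. */
static const char *TestExampleName(void) {
  return TEST_CASE_NAME;
}
```

A test case would then call SetUpDtls(TEST_CASE_NAME) instead of SetUpDtls(__func__), and the rest of the pattern stays unchanged.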

The other feature of the test cases is that they look very similar to one another; this is a deliberate violation of the Don’t Repeat Yourself principle, because defining test cases is the one place where code duplication makes sense. There will be a lot in common between some test cases, though every test case will have at least one detail different from all others. Having the test cases look so similar to one another not only allows for the rapid generation of test case functions, but when test cases are small and self-contained, it also becomes easier to take stock of the similarities and differences by scanning them.

Only declarative statements are getting duplicated, not any detailed program logic. Assertion macros or functions would be OK to duplicate as well, since their intent is basically declarative.

Also notice that, were this a data-driven test, the test fixture structure used in this function likely would have been declared thus (replacing the SSL_CTX and SSL members with DTLSv1_server_method):

{ "TestDtls1Heartbleed", dtls1_process_heartbeat, DTLSv1_server_method, 
  "   HEARTBLEED                ", kMaxPrintableCharacters, 0, 14, 0, "",
},

The member-by-member initialization of the Pseudo-xUnit version is easier to read in general, which makes it easier to work with when a test case fails, or when defining a new test case as a variation on an existing one. The Pseudo-xUnit version is easier to update when a new member is added, since the common SetUp() function can define a default value for the new member, and only the test cases for which that new member is relevant need to be updated. At the same time, the essence of the data-driven/table-driven structure is preserved, with the execution function containing all of the common execution and assertion logic.
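The point about new members can be sketched with a toy fixture. Here max_payload_len stands in for a member added after the first test cases were written; the structure, the member, the default value, and the function names are all hypothetical. The two "case" functions return the configured fixture rather than executing anything, purely to keep the sketch short.

```c
#include <string.h>

enum { kDefaultMaxPayloadLen = 16 };

typedef struct {
  const char *test_case_name;
  int sent_payload_len;
  int expected_payload_len;
  int max_payload_len; /* hypothetical member added later */
} ExampleFixture;

static ExampleFixture SetUpExample(const char *test_case_name) {
  ExampleFixture fixture;
  memset(&fixture, 0, sizeof(fixture));
  fixture.test_case_name = test_case_name;
  /* Default for the new member: existing test cases need no edits. */
  fixture.max_payload_len = kDefaultMaxPayloadLen;
  return fixture;
}

/* Written before max_payload_len existed, and left untouched. */
static ExampleFixture ExistingCase(void) {
  ExampleFixture fixture = SetUpExample(__func__);
  fixture.sent_payload_len = 8;
  fixture.expected_payload_len = 8;
  return fixture;
}

/* Only the case that cares about the new member overrides it. */
static ExampleFixture NewCase(void) {
  ExampleFixture fixture = SetUpExample(__func__);
  fixture.max_payload_len = 4;
  fixture.sent_payload_len = 8;
  fixture.expected_payload_len = 0;
  return fixture;
}
```

In a table of positional initializers like the one shown above, the same change would mean revisiting every row to insert a value in the right column.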

Test Runner

The test runner can be as straightforward as a main() function that adds up all of the return values from each test case and reports the number of test cases that failed:

int main(int argc, char *argv[]) {
  SSL_library_init();
  SSL_load_error_strings();

  int num_failed = TestDtls1NotBleeding() +
      TestDtls1NotBleedingEmptyPayload() +
      TestDtls1Heartbleed() +
      TestDtls1HeartbleedEmptyPayload() +
      /* The following test causes an assertion failure at
       * ssl/d1_pkt.c:dtls1_write_bytes() in versions prior to 1.0.1g: */
      (OPENSSL_VERSION_NUMBER >= 0x1000107fL ?
           TestDtls1HeartbleedExcessivePlaintextLength() : 0) +
      TestTls1NotBleeding() +
      TestTls1NotBleedingEmptyPayload() +
      TestTls1Heartbleed() +
      TestTls1HeartbleedEmptyPayload() +
      0;

  ERR_print_errors_fp(stderr);

  if (num_failed != 0) {
    printf("%d test%s failed\n", num_failed, num_failed != 1 ? "s" : "");
    return EXIT_FAILURE;
  }
  return EXIT_SUCCESS;
}

The test runner can be arbitrarily more complex than this, but something like the above will likely do the trick most of the time. The trickiest part is remembering to add each new test case to main().
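As one example of a slightly more elaborate runner, the test case functions can be collected in an array and called in a loop. This is only a sketch under assumed names (TestCaseFunc, kTestCases, RunTestCases, and the stand-in test cases are all hypothetical), and it does not remove the need to register each test case by hand; it just gathers the registrations in one list.

```c
#include <stddef.h>
#include <stdio.h>

typedef int (*TestCaseFunc)(void);

/* Hypothetical stand-ins for real test cases: 0 on success, 1 on failure. */
static int TestAlwaysPasses(void) { return 0; }
static int TestAlwaysFails(void) { return 1; }
static int TestAlsoPasses(void) { return 0; }

/* Every test case still has to be listed here explicitly. */
static const TestCaseFunc kTestCases[] = {
  TestAlwaysPasses,
  TestAlwaysFails,
  TestAlsoPasses,
};

/* Runs each test case, reports the failure count, and returns it. */
static int RunTestCases(const TestCaseFunc *cases, size_t num_cases) {
  int num_failed = 0;
  size_t i;
  for (i = 0; i < num_cases; ++i) {
    num_failed += cases[i]();
  }
  if (num_failed != 0) {
    printf("%d test%s failed\n", num_failed, num_failed != 1 ? "s" : "");
  }
  return num_failed;
}
```

Whichever form the runner takes, declaring the test cases static offers a modest safety net: GCC and Clang can warn about a static function that is defined but never referenced (-Wunused-function, which -Wall enables), and that is exactly what a forgotten test case looks like.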

Conclusion

Even though this unit test was written in C, which has no direct support for Object-Oriented Programming, I was able to get nearly all of the benefits of an xUnit framework without any special tools or macro magic, and the code is nearly as clean and well-organized as it would be in any other language. I’m hoping that we will be able to apply this pattern to great effect in the effort to improve OpenSSL’s unit/automated test coverage, and that these ideas may be of use to others wishing to introduce unit testing to an existing code base without having to learn any new frameworks or tools.

Footnotes

¹ Since I always have to mention it whenever I mention Go, do check out Aaron Jacobs’s ogletest framework as well as Go’s built-in coverage tools. ↩
² I’m surprised that no one has mentioned anything to me about my tongue-in-cheek use of goto fail statements in my Heartbleed unit test. ↩