A Burning Issue
We interrupt the current blog thread (the “E” word series) to bring you a burning issue. Well, burning for me, anyway.
I have been working with some other Pillar programmers on systems for helping not-yet-agile programmers learn some best practices. And while many of us in the industry are accustomed to coaching, mentoring, training, and otherwise cajoling people to attempt TDD specifically as a practice, I recently have begun to suspect that in fact, that’s a poor place to start the conversation.
TDD is all about how you get a good design, good tests that serve as specifications, and, most critically to my mind, a great xUnit test suite for regression protection. But what is a great xUnit test suite? What does one look like?
I have been finding (but not grokking until recently) that before I can have a TDD conversation with anyone, I really have to have a good conversation about the characteristics and value of a great xUnit suite.
Characteristics of a Great xUnit Test Suite
So when I come across a fresh codebase (I mean fresh to me — it might actually be quite rotten), these are the things I want to see in the xUnit tests. In future posts, I can give these more discussion, and perhaps include code snippets, but for today, it’s just a list:
- Code coverage is no lower than 85%. (Note: As important as code coverage is — especially for teams new to xUnit best practices — it can be a dangerous narcotic. It can hide bigger problems. It is possible to have a test suite that provides 100% coverage that is about 100% crappy. People do things like comment out all assertions except assertNotNull(blah), and make other poor choices when under pressure to (A) keep the coverage rates up, and (B) get the features out the door.)
- As much of the testing as possible is accomplished by “isolation tests”: small unit tests that run entirely in memory, with no dependencies on file systems, networks, databases, or other external resources. This is Michael Feathers’ definition of a unit test. This level of isolation (and the execution speed that goes with it) in turn depends on proper use of static and dynamic mocks. That in turn depends on dependency injection, which in turn depends on people knowing enough OO to code to interfaces.
- Speaking of execution speed: isolation test suites should average no more than 0.5 seconds per test, on a crappy machine. If everything really is in memory, it’s pretty common to get speeds of more like 100 isolation tests per second.
- The suite also includes end-to-end tests, “collaboration tests,” and other tests that are more real-world than isolation tests, involve little or no mocking, and take longer to set up and run. These tests do talk to real databases, real networks, and perhaps completely external systems through various APIs.
- The isolation tests and non-isolation tests are kept separate from each other (in separate source folders, to my mind), so that they can easily be run separately by developers and by a CI server. As projects grow, their non-isolation suites slow down. Because we don’t want to discourage programmers from running isolation test suites frequently, we want to keep isolation test execution fast, and we want to keep the build nice and fast too. So we want to be able to run the slower non-isolation suites separately, and perhaps less frequently. If the slow tests run slowly enough, we may not make them part of each CI build, but instead run them every few hours, or overnight, in a separate CI target.
- Each test method involves only one cycle of Arrange/Act/Assert (setup and instantiation, getting to the testable state, and verifying that state).
- Each isolation test method isolates a thin slice of system behavior. One industry term for this (proposed by Industrial Logic) is “micro-tests.”
- Average length of test methods is under 20 lines, ideally fewer than 10 lines.
- Test methods and TestCase classes are written and organized in terms of system behavior, not system structure. Related to this: all the test methods in a TestCase use the code in the setUp() method in that class, with as little additional test-specific setup as possible. All of the “Arrange” part of “Arrange/Act/Assert” really should be handled in the setUp() method, whenever possible.
- TestCases systematically cover unhappy paths: exception cases, edge cases and boundary conditions, etc. Mocks/fakes are used to simulate failure of external dependent resources.
- TestCase object trees make effective use of base TestCase classes, and make good use of reusable, private or protected helper methods (a sort of local testing DSL). Or, as Ryan points out in the comment below, the TestCases all use a separate object tree that holds a well-thought-out, rich little local testing DSL, completely decoupled from the test code. The more of that DSL pattern you need, as Ryan might say, the less you want to use inheritance, and the more you want to use composition.
- Test suites manage test data centrally (the repository of canonical test data might be a static class full of constants, or an in-memory database, or whatever). TestCases and test methods avoid primitive type literals wherever possible, and likewise avoid duplicate local variables and constants.
- Test suites, TestCase classes, and test methods contain as little duplicate code as possible. This includes small details like recurring complex assertion patterns that can be extracted, repeating the name of the TestCase in a test method name, etc.
- TestCase classes and Test methods have intention-revealing names, and use a consistent naming convention.
- Test suites are designed to be as resistant as possible to production code design changes. They are robust, not brittle.
- Test suites test the hard and harder things: XML configuration files, servlets, Swing GUIs, JSP files, etc.
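To make a few of the items above concrete, here is a minimal sketch of an isolation test in Python’s unittest (itself an xUnit-family framework; a JUnit version would look nearly identical). The `TaxCalculator`, `RateSource`, and `FakeRateSource` names are hypothetical, invented for this example. The production class codes to an interface, the dependency is injected, and an in-memory fake keeps the test entirely out of the database:

```python
import unittest

class RateSource:
    """The interface production code depends on (code to interfaces, not implementations)."""
    def rate_for(self, region):
        raise NotImplementedError

class TaxCalculator:
    """Production class: the rate source is injected, so tests can substitute a fake."""
    def __init__(self, rates):
        self.rates = rates

    def tax(self, amount, region):
        return round(amount * self.rates.rate_for(region), 2)

class FakeRateSource(RateSource):
    """In-memory fake: no database, no network, so the test runs entirely in memory."""
    def rate_for(self, region):
        return {"OH": 0.07}[region]

class TaxCalculatorTest(unittest.TestCase):
    def setUp(self):
        # Arrange: shared by every test method in this TestCase
        self.calculator = TaxCalculator(FakeRateSource())

    def test_applies_regional_rate(self):
        self.assertEqual(7.0, self.calculator.tax(100.00, "OH"))  # Act + Assert
```

Because nothing here touches an external resource, hundreds of tests like this can run per second.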
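The unhappy-path item deserves its own sketch: a fake that simulates the failure of an external dependent resource, with a single Arrange/Act/Assert cycle per test method. Again the names (`UserStore`, `BrokenUserStore`, `UserService`) are hypothetical:

```python
import unittest

class UserStore:
    """Interface for an external resource, such as a user database."""
    def load(self, user_id):
        raise NotImplementedError

class BrokenUserStore(UserStore):
    """Fake that simulates the external resource failing."""
    def load(self, user_id):
        raise ConnectionError("database unreachable")

class UserService:
    def __init__(self, store):
        self.store = store

    def display_name(self, user_id):
        try:
            return self.store.load(user_id)["name"]
        except ConnectionError:
            return "(unavailable)"  # degrade gracefully instead of crashing

class UserServiceUnhappyPathTest(unittest.TestCase):
    def setUp(self):
        self.service = UserService(BrokenUserStore())  # Arrange

    def test_reports_placeholder_when_store_is_down(self):
        name = self.service.display_name(42)     # Act
        self.assertEqual("(unavailable)", name)  # Assert
```

No real database has to be taken down to exercise this failure mode, which is exactly why mocks and fakes earn their keep on unhappy paths.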
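Finally, a sketch of centrally managed test data plus a reusable assertion helper, the seed of a small local testing DSL. `CanonicalOrders` and `assert_valid_order` are hypothetical names for this illustration:

```python
import unittest

class CanonicalOrders:
    """Central repository of canonical test data: tests reference these
    instead of scattering primitive literals through every test method."""
    SMALL = {"id": "ORD-1", "total": 10.00, "lines": 1}
    BULK  = {"id": "ORD-2", "total": 950.00, "lines": 40}

def assert_valid_order(testcase, order):
    """Reusable assertion helper, extracted from a recurring assertion pattern."""
    testcase.assertIn("id", order)
    testcase.assertGreater(order["total"], 0)
    testcase.assertGreaterEqual(order["lines"], 1)

class OrderDataTest(unittest.TestCase):
    def test_small_order_is_well_formed(self):
        assert_valid_order(self, CanonicalOrders.SMALL)

    def test_bulk_order_is_well_formed(self):
        assert_valid_order(self, CanonicalOrders.BULK)
```

Helpers like this can live in free functions (composition) rather than a base TestCase (inheritance); the more DSL you accumulate, the more that choice matters.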
I’ve gathered up this first-draft list of characteristics from multiple sources — books, others’ experience, and my own experience. I’m sure I’m missing a few things in there — I’ll add and prune according to my future thinking and your comments.
Paint the Fence; Sand the Floor
Before people can talk to me with authority about the value of TDD, they need to talk with authority about the value of a great xUnit test suite. And before they can do that, they need to have (as my late mother would have said) suffered enough. They need to have suffered at the hands of codebases without great xUnit suites. They also need to have had their bacon saved by great xUnit suites.
So before we get to the TDD conversation, I increasingly want to encourage programmers new to xUnit testing practices to shoot for an xUnit test suite with the above characteristics. I don’t especially care, at first, how or why they paint the fence (from The Karate Kid), as long as they do it. I would in fact prefer that life and code provide them with the painful, indelible lessons that go with good and bad xUnit test suites.
THEN, once they have felt how hard it is to get that great xUnit suite when they have to stop, go back, and retrofit tests onto existing code, and once they have felt how hard it is to debug an “Eager Test” (from Gerard Meszaros’ great book on refactoring xUnit tests, xUnit Test Patterns), THEN we can talk about how, hey, you know, if that great xUnit test suite is your goal, my experience has been that TDD gets me there better and faster.
Now we are painting the fence in a specific way.
But along the way, it’s all good.