christian-thompson.net

My ideas about Agile, DevOps, and software development in general

Test the Leaves and the Trees

There are two main categories of automated tests – integration tests and unit tests. Each has its strengths and limitations, and we need both when building high-quality software. To make a forest-based analogy, if integration tests cover the trees, we need unit tests to hit the leaves. There should be a difference in the breadth of functionality tested by each kind of test, but it’s not uncommon for developers to get this wrong and write unit tests that act more like integration tests. I’ve seen two major reasons for this. The first is confusion about what a unit test should do, and the second is the result of a code smell – big multi-responsibility classes.

The first scenario probably happens because it’s intuitive to write tests that simulate actions a user can take in the system (i.e. user stories). But unit tests should be written at the lowest possible level, ideally against a single-responsibility method (i.e. a “unit”). This unit is almost always going to implement a lot less functionality than a complete user story. For example, say we have a user story to add an item to a cart on an e-commerce site. We could write an integration test for the story that adds the item and then checks that the list of cart items is updated, the shipping costs are updated appropriately, the new cart total is correct, and that all the other acceptance criteria are met. But we’d also want to write individual unit tests against all the functional units that make up this story – like one to test the functionality that sums up the cart total.
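As a rough illustration of what a unit test for the cart total might look like, here is a minimal Python sketch. The `cart_total` function and its `(price, quantity)` input shape are assumptions made up for this example, not code from a real system. Note how little of the user story it touches: no cart list, no shipping, just the sum.

```python
from decimal import Decimal


def cart_total(items):
    """Hypothetical unit under test: sum price * quantity over (price, quantity) pairs."""
    return sum((price * quantity for price, quantity in items), Decimal("0"))


def test_cart_total_sums_line_items():
    # Two items at 19.99 plus one item at 5.00.
    items = [(Decimal("19.99"), 2), (Decimal("5.00"), 1)]
    assert cart_total(items) == Decimal("44.98")


def test_cart_total_of_empty_cart_is_zero():
    assert cart_total([]) == Decimal("0")
```

The test functions follow the `test_` naming convention that pytest discovers, but they are plain functions and can be called directly as well.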

The second scenario happens when testing big, multi-responsibility classes. Most big classes have a lot of private methods that do most of the work of the class. The prevailing (and correct IMO) advice when writing unit tests is to test a class’s public interface. This helps to make the tests stable – we can completely change a class’s implementation without invalidating the tests written against it. This isn’t the case when we directly test private or protected methods, where refactors are more likely to generate false positive test failures. But… if you have a big, multi-responsibility class, writing tests against the public interface doesn’t work well. In these classes, complex functionality is often buried three or more hops down the call stack and intermixed with other complex functionality – i.e. we’re back to testing too much functionality for a unit test.

What’s the big deal about unit tests anyway? If writing integration tests is so intuitive, why not just stick with them? The answer is that it’s virtually impossible to get full coverage on functionality with integration tests alone. To understand why, we need to take a quick detour to talk about exponential growth. Exponential growth is hard to fully wrap your head around. It’s an easy enough concept – most of us probably took enough high school math that we’re picturing a Cartesian plane and a line curving up – but no matter how many popular science examples I run across, I’m always surprised by just how fast it grows.

One example I found on the internet involves placing grains of rice on a chess board (Wheat and chessboard problem – Wikipedia). Place one grain on the first square, 2 on the second square, 4 on the third, and keep doubling until the board is full. What’s the total number of grains we’ll need? Maybe a few thousand? A few hundred thousand? Nope – the right answer is about 18.4 quintillion grains of rice. For comparison, one popular estimate puts the number of grains of sand on all the beaches in the world at 7.5 quintillion. We’d need roughly two and a half times that number to fill the chess board! As I said before, pretty mind boggling.
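The chessboard total is easy to check directly. This short Python sketch sums the doubling sequence over all 64 squares; the variable names are just for illustration.

```python
# Square 1 gets 2**0 grains, square 2 gets 2**1, ..., square 64 gets 2**63.
grains_per_square = [2 ** square for square in range(64)]
total_grains = sum(grains_per_square)

# A doubling series over 64 squares sums to 2**64 - 1.
assert total_grains == 2 ** 64 - 1

print(f"{total_grains:,}")  # 18,446,744,073,709,551,615
```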

The point of that sidebar is that we can see a similar effect when we look at a function call stack. Each method has a certain number of execution paths in it (roughly what its cyclomatic complexity measures) and some of these paths will call other methods. This continues until we get down to the leaves of the call tree. A user story’s implementation can easily involve 100+ methods. If 64 of these methods work together in a single call chain (probably an overestimate of reality) and all of them have a branching factor of 2 (probably an underestimate), then there are 2^64 paths from top to bottom and we’ve recreated the chessboard and rice situation. Bottom line, we don’t have the lifetimes it would take to hit all 18 quintillion of those paths if we started at the top of the tree.

On the other hand, it’s totally possible to test all the paths in all those methods if we test each of them individually. In this case, the number of paths grows linearly, which is a much friendlier way to grow :-). For example, if all 100 methods have 10 paths, we have 100 x 10 = 1000 total paths to test. That’s a lot, but doable with a few weeks of work.
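To make the contrast concrete, here is a small Python sketch of the two ways of counting. The function names are made up for this example, and the model is deliberately simplified: it assumes every method in the chain has the same number of branches and that every combination of branches is reachable.

```python
def paths_from_top(methods_in_chain, branches_per_method):
    """Paths an end-to-end test suite must cover: the branches multiply."""
    return branches_per_method ** methods_in_chain


def paths_tested_individually(method_count, paths_per_method):
    """Paths unit tests must cover when each method is tested alone: they add up."""
    return method_count * paths_per_method


print(paths_from_top(64, 2))                # 18446744073709551616
print(paths_tested_individually(100, 10))   # 1000
```

Adding one more two-way method to the chain doubles the first number but only adds 2 to the second.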

So how can we unit test a big multi-responsibility class? I said before that I agree with the advice to test the public interface. But now I’m going to recommend testing all the private methods in these classes directly. Which is right? Well… it depends. For single responsibility classes, test the public interface for sure. But for big, multi-responsibility classes, we need to find a way to test methods individually in order to get anywhere near complete coverage. And I’d argue that these classes are the ones that need the coverage the most. Likely they’ve gotten big over time without unit tests and have turned to spaghetti. They’re hard to change and fragile – technical debt that needs to be tackled.

There are a few options for testing private methods that I know of:

  1. Some unit test frameworks have mechanisms to do it. The ones I’ve seen are ugly and require specifying the method name as a string so the framework can do some runtime magic. They won’t give you compilation errors if the method name or signature changes, which removes one of our shortest feedback loops. Not good – avoid this option.
  2. Use a pass-through method. Make the method protected and create a test object that inherits from the object under test. Add a public method in the test object that calls the protected method and returns its result. This used to be my preferred method because it doesn’t completely blow away information hiding. But it’s more code and complexity to maintain and it’s still pretty ugly.
  3. Just flip the method to public and call it good. Terrible advice, right? Yes, but I think it’s the best of the bad options we have available to us. We lose information hiding but remember this is in a class with a ton of code in it. That private method is already accessible to 100+ methods and thousands of lines of code, so not very hidden after all.
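The pass-through approach from option 2 can be sketched in Python, with one caveat: Python doesn’t enforce access levels, so a single leading underscore stands in for “protected” here by convention only. In a language like Java or C# the same shape would use a real `protected` method. `OrderProcessor`, `_discount`, and `OrderProcessorForTest` are all hypothetical names invented for this example.

```python
class OrderProcessor:
    """Stand-in for a big, multi-responsibility class (hypothetical)."""

    def checkout(self, prices_in_cents, coupon_percent):
        # Public entry point. In a real big class the discount logic
        # would sit several calls below this and be mixed with other work.
        subtotal = sum(prices_in_cents)
        return subtotal - self._discount(subtotal, coupon_percent)

    def _discount(self, subtotal, coupon_percent):
        # "Protected" by convention only: the leading underscore.
        if coupon_percent <= 0:
            return 0
        return subtotal * min(coupon_percent, 100) // 100


class OrderProcessorForTest(OrderProcessor):
    """Test-only subclass that exposes the protected method."""

    def discount(self, subtotal, coupon_percent):
        # Pass-through: call the protected method and return its result.
        return self._discount(subtotal, coupon_percent)


def test_discount_is_capped_at_100_percent():
    assert OrderProcessorForTest().discount(200, 150) == 200


def test_no_discount_for_zero_coupon():
    assert OrderProcessorForTest().discount(200, 0) == 0
```

The tests reach the discount logic directly instead of going through `checkout`, at the cost of an extra class that exists only for testing.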

All three of these options are ugly solutions to an ugly problem. They allow us to get good coverage on previously untestable code, but we’ve violated information hiding and made our tests less stable. So don’t look at this as a final step – it’s an intermediate step to allow you to refactor more safely. Get the coverage and then start refactoring all that logic into several single-responsibility classes that can be tested correctly.

Phew! So now we’ve “tested the leaves.” What about the trees? Unit tests are really great at getting full coverage on every path in the system. But they aren’t perfect – they don’t test how classes and methods work together to implement a user story and they don’t hit external dependencies like the database or the file system. To do those things you need integration tests, like automated UI and API tests and high-level unit tests. Think of them as the glue that binds all the unit tests together to form a complete picture.

So coming full circle, we need high-level integration tests and low-level unit tests – the trees and the leaves. We need integration tests to test user stories front to back, including calls to external dependencies. And we need unit tests to give ourselves a chance at getting complete coverage of a system’s functionality. One or the other is certainly better than nothing, but we need both to take quality to the next level.