Friday, October 29, 2004

Approaches To Unit Testing

    I was recently involved in a discussion about unit testing.  I'll simplify the issues here; there are many more arguments each way, and the area is more complex than presented.  I will lay out the supposed advantages and disadvantages of each side of the issue.
    Unit testing has been made quite popular lately with the advent of XP (eXtreme Programming).  The idea is fairly simple: have developers write tests to verify that their code works as intended.  It is after this point that views diverge.  Well, some diverge before that and think that developers shouldn't be bothered to write tests, but that, as they say, is a topic for another day.  The XP community seems to think that unit testing should be done at a very granular level: each individual function or object should be tested.  Others think that the unit tests should be more holistic.
    Granular unit testing is often synonymous with unit testing.  In it, developers test their code directly, trying each input and failing if the expected output isn't returned.  Often scaffolding is used to instantiate the object or call the function apart from the surrounding code.  Mock objects might also be used to insulate the real code from its reliance upon systems below, above, or beside it.  The advantage of this technique is that it is fast and thorough.  Each function, method, or class can be tested fully.  Each piece of code, even those difficult or impossible to reach from the public interfaces, can be tested.  The disadvantage is that testing each piece in isolation doesn't test the system as a whole.  It doesn't test the interactions between the pieces of code that will be working together in the real world.  Nor does it exercise the shipping binary: because each object is tested by standalone code, you don't get to see how the whole system really works.
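    A minimal sketch of the granular style, using Python's unittest (an xUnit cousin of JUnit) purely for illustration.  The PriceCalculator class and its tax service are hypothetical, not from any real system; the mock object stands in for the dependency so the class is tested in isolation.

```python
import unittest
from unittest import mock


# Hypothetical code under test: a calculator that relies on a
# separate tax-rate service (the "system beside it").
class PriceCalculator:
    def __init__(self, tax_service):
        self.tax_service = tax_service

    def total(self, subtotal):
        return round(subtotal * (1 + self.tax_service.rate()), 2)


class PriceCalculatorTest(unittest.TestCase):
    def test_total_applies_tax(self):
        # The mock object insulates the test from the real tax
        # service: we control exactly what rate() returns.
        fake_tax = mock.Mock()
        fake_tax.rate.return_value = 0.10
        calc = PriceCalculator(fake_tax)
        self.assertEqual(calc.total(100.0), 110.0)
```

    The test never touches a real tax service, so it is fast and the failure, if any, points directly at PriceCalculator.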
    The other form of unit testing is testing only from exposed interfaces.  This is sometimes known as functional testing.  The only item being tested is the shipping binary, and the only entry points used are those available on that shipping binary.  The advantage of this type of testing is that it tests the whole system.  It tests the interaction between each part of the system as it will be used in the real world.  These tests are also easily utilized by the test team to run as part of their testing suite.  The disadvantage is that it can be hard or even impossible to test many of the system internals.  Sometimes a simple interface can have a lot of code hidden behind it.  It also requires a greater amount of code to be written before the testing can even begin.
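    A sketch of the functional style, with hypothetical names for illustration: a small two-layer system where the test drives only the public evaluate() entry point, so the parser and evaluator are exercised together as a black box.

```python
# Internal helper -- a functional test never calls this directly.
def _parse(expr):
    left, op, right = expr.split()
    return float(left), op, float(right)


# Public interface: the only entry point the tests are allowed to use.
def evaluate(expr):
    """Evaluate a simple 'a op b' expression, e.g. '2 + 3'."""
    left, op, right = _parse(expr)
    return left + right if op == "+" else left - right


# Functional-style tests: realistic input in, end-to-end result out.
assert evaluate("2 + 3") == 5.0
assert evaluate("10 - 4") == 6.0
```

    Note that a bug in _parse would surface here too, but only indirectly, which is exactly the trade-off described above.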
    In this particular discussion, the advocacy was for the second type of testing only.  I advocated a blended approach, thinking that would cover all bases.  It was argued that the granular form of testing tests at too low a level--that most bugs are found in the interaction between components and not in the functions themselves.  It therefore made little sense to even write them.
    What do you think?  Do isolated unit tests have much return on investment or are they better off left undone?  Have you had any experience with either of these approaches?  If so, did it work out well or poorly?


  1. There has to be a combination. Isolated tests can work well *if* (and it's a very big if) the developers know how to write test cases. Too often developers don't know how to do this, and restrict their tests to the obvious and to the use cases in the functional specs, missing out things like passing obviously bogus parameters, or just testing known cases that will pass. This gives a big false sense of security; "Hey my code passes my tests, it must be right".

    But done properly it can be very useful; the number of times I forget to actually wire up properties is embarrassing.
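    A small sketch of the point about bogus parameters, with a hypothetical parse_age() for illustration: the happy-path test is the easy one to stop at, while the deliberately invalid inputs are the ones that get skipped.

```python
def parse_age(value):
    age = int(value)            # raises ValueError on garbage input
    if not 0 <= age <= 150:
        raise ValueError("age out of range: %d" % age)
    return age


# The obvious case -- the kind of test it's tempting to stop at.
assert parse_age("42") == 42

# Obviously bogus parameters -- the cases that are easy to skip.
for bad in ("forty-two", "-1", "9000"):
    try:
        parse_age(bad)
        raise AssertionError("expected ValueError for %r" % bad)
    except ValueError:
        pass
```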

  2. I think a combination of both approaches is best. The developer must be as sure as he can be that the functions do what they're supposed to and are resilient to the expected classes of errors, ideally before checking in. Otherwise, the point of the functional tests (to expose integration problems and specification deficiencies) is lost. Having both types of tests allows easier pinpointing of failures when they do occur: if the unit tests indicate that a component is working correctly with correct input but it's failing in this case, the input to that component must be considered suspect.

    Insufficiently tested components can lead to problems when those components are used. A developer can come to rely on bugs in other components; if those bugs are fixed due to problems revealed by bug reports in another area, it could break this component.

    Agile methods often encourage refactoring code to make it more understandable. Whenever you're editing code, there's a risk of introducing bugs. Unit testing is therefore recommended to verify that bugs are not introduced during refactoring or enhancement.
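    A tiny sketch of that safety net, with a hypothetical function for illustration: a behaviour-pinning test that both the original and the refactored implementations must satisfy, so the refactoring can't silently change what the code does.

```python
def total_before(items):
    # Original implementation: explicit loop.
    result = 0
    for price in items:
        result += price
    return result


def total_after(items):
    # Refactored implementation: built-in sum().
    # Same observable behaviour, simpler code.
    return sum(items)


# The same pinning test guards both versions; if the refactoring
# broke anything, this is where it would show up.
for impl in (total_before, total_after):
    assert impl([1, 2, 3]) == 6
    assert impl([]) == 0
```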

  3. I simply don't agree with the assertion that most bugs only occur in object interaction.

    The more accurate observation is that without granular unit testing, bugs in individual classes or methods only surface during integration.  And by that time, it has become much more difficult to trace a problem to its root, since there are so many classes involved, all untested.

    Using granular testing, you ensure that each object behaves exactly as it is intended to, so that by integration time the problems you discover are truly integration-related, as opposed to some silly coding error somewhere that could have been easily detected by granular testing.

    Integration issues should be tested separately. Don't forget, however, that if objects don't integrate well it's most likely due to a flaw in the architecture. A well thought-out up-front design would therefore do wonders to prevent them from happening in the first place.

    That said, it is interesting to note the extent to which XP diehards disparage up-front design.

  4. For people getting started with unit testing, I have two big recommendations. First: always write tests before code. Second: do it incrementally, so that you write a test and spend 5-10 minutes making it pass. If you do this, you'll discover that you will end up with some tests at both unit and integration levels. Why? Because to get an app working you need to code at different levels of detail.
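    One increment of that loop might look like the sketch below, with a hypothetical slugify() function for illustration: the test is written first (and fails, since slugify doesn't exist yet), then just enough code is written to make it pass before moving to the next increment.

```python
# Step 1: the test, written before any implementation exists.
def test_slugify():
    assert slugify("Hello World") == "hello-world"
    assert slugify("  Trim Me ") == "trim-me"


# Step 2: the minimal implementation written to satisfy the test.
def slugify(title):
    return "-".join(title.lower().split())


test_slugify()
```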

    An approach like this means that every significant lump of code is tested. For significant code at the lower levels, those tests give you confidence that you're building on a strong foundation. And for significant code at upper levels, tests give you confidence that the system really does the thing it's supposed to. In my experience (4+ years of XP-ish development, lots of traditional development), both are valuable.

    And I think it's worth going further. As you do this, you'll discover that tests aren't just to make sure nothing breaks. They also serve as a computer-verifiable specification. So for objects or systems that others will need documentation for, it pays to polish and extend your test suite to make it as clear and readable as a good spec. For me, this generally involved going back through and rounding things out so that low-level behavior is directly tested in low-level tests, even though higher-level tests happen to verify it already.
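    A sketch of that "tests as specification" idea, with a hypothetical Stack class for illustration: each test name states one rule of the class's behaviour, so the suite reads like a short spec.

```python
class Stack:
    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()


# Each test name documents one rule of the Stack's contract.
def test_pop_returns_most_recently_pushed_item():
    s = Stack()
    s.push(1)
    s.push(2)
    assert s.pop() == 2


def test_pop_on_empty_stack_raises():
    s = Stack()
    try:
        s.pop()
        raise AssertionError("expected IndexError")
    except IndexError:
        pass


test_pop_returns_most_recently_pushed_item()
test_pop_on_empty_stack_raises()
```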

  5. It's always a bit of a hard one to be disciplined enough to go that little bit extra to create automated tests. In my opinion using things like JUnit, although good from a standards point of view, is probably a bad thing for those new to testing because it's so awkward for testing anything remotely complex, and new users will usually give up in a short amount of time.

    It's also hard convincing people of the merits of automated testing on both a unit level and a functional level until they have to make some changes and they see that they have greater confidence that they haven't broken anything. People are often quite happy to be mindless drones, creating the same input data over and over to test something. What suffers eventually is the frequency of testing, as it will soon get too tedious and get forgotten.

    I think there is merit to any automated testing, regardless of the level, because it means that those tests can be run again and again at almost zero cost in time. So the low-level stuff is good because it makes the developer think about making solid small pieces, and the higher-level stuff is important also because all the unit testing in the world won't help you verify that the system works from start to finish.

    So long as it's easy to run the tests, any and all automated tests are useful (so long as they test SOMETHING). They don't assure that you've got bug free code, but they increase confidence, and if they break then it's definitely an indication that something needs looking at.

    So yes, and yes.. :)