Friday, July 20, 2007

Keep Your BVTs Clean

At Microsoft we build each of our products on a daily basis.  After each successful build, we run a series of automated tests we tend to call BVTs (Build Verification Tests).  If the BVTs fail, no further testing is done and developers are called in to fix the issue ASAP.  The idea is simple, but actually implementing it can reveal unexpected complexities.  One point that is often not considered is which tests to put in the BVT suite.

It is sometimes tempting to put all of your automated tests into a BVT.  If they don't take too long to run, why not?  It is important to have only critical tests in your BVT suite.  The reason is that BVTs are supposed to be like the coal-miner's canary.  If they fall over, there is danger ahead.  Drop everything and fix the issue.  If you put all of your automated tests into the BVTs, you'll have lots of non-critical failures.  You'll have something more akin to a Tennessee fainting goat than a canary.  It will fall over often but you'll quickly learn to ignore it.  If you see a failure and say "That's okay, we can continue on anyway," that test shouldn't be in the BVT.  The last thing you want is to become numb to failure.  Put only those tests into your BVT that indicate critical failures in the system.  Everything else should be in a separate test pass run after the BVTs pass.
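To make the split concrete, here is a minimal sketch of a two-stage run: the BVTs act as a gate, and the broader test pass runs only if every BVT passes. The names `run_suite` and `run_daily_pass` are hypothetical, and tests are modeled as plain callables that return True on a pass. This illustrates the idea only; it is not a description of any actual Microsoft tooling.

```python
from typing import Callable, Dict, List

# Assumption for illustration: a test is a callable returning True on pass.
Test = Callable[[], bool]


def run_suite(tests: Dict[str, Test]) -> List[str]:
    """Run every test and return the names of the ones that failed."""
    failures = []
    for name, test in tests.items():
        try:
            passed = test()
        except Exception:
            passed = False  # a crash counts as a failure
        if not passed:
            failures.append(name)
    return failures


def run_daily_pass(bvts: Dict[str, Test], full_pass: Dict[str, Test]) -> dict:
    """Run the BVTs first; run the broader pass only if every BVT passed."""
    bvt_failures = run_suite(bvts)
    if bvt_failures:
        # Any BVT failure blocks all further testing until it is fixed.
        return {"status": "blocked", "bvt_failures": bvt_failures, "other_failures": []}
    # BVTs are clean, so the non-critical tests run as a separate pass.
    return {"status": "bvt_clean", "bvt_failures": [], "other_failures": run_suite(full_pass)}
```

A failure in the second stage is reported but does not block anyone; only the first stage is the canary.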

It is imperative to keep your BVTs clean. By that, I mean that the expected behavior should be for every test to pass.  It is not okay to have a certain number of known failures.  Why?  Because there is no clear indication of a critical failure.  "I can't recall, do we usually have 34 or 35 failures?"  There are two things to consider in keeping the BVTs clean.  First, are the tests stable?  Second, are the features the tests cover complete?  If the answer to either of these is no, they shouldn't be in the BVTs.

When I say tests should be stable, I mean that their outcome is deterministic and that it is always a pass unless something actionable goes wrong.  Instability in tests can come from poorly written tests or poorly implemented features.  If the tests behave in a seemingly nondeterministic manner, they shouldn't be in your BVT.  You'll be constantly investigating false failures.  Fix the tests before enabling them.  If a feature is flaky, you shouldn't be testing it in the BVT.  It is not stable enough for the project to be relying on it.  File bugs and make sure that developers get on the issues.
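One rough way to screen a test for stability before enabling it is to run it repeatedly against a known-good build and require a pass every time. The sketch below assumes a hypothetical `is_stable` helper and, again, tests modeled as callables returning True on a pass. Repeated passes do not prove a test is deterministic, but a single failure against a good build is enough to keep it out of the BVTs.

```python
from typing import Callable


def is_stable(test: Callable[[], bool], runs: int = 20) -> bool:
    """Return True only if the test passes on every one of `runs` attempts.

    Intended to be run against a build that is known to be good, so any
    failure here points at the test (or a flaky feature), not a regression.
    """
    for _ in range(runs):
        try:
            if not test():
                return False
        except Exception:
            return False  # an exception counts as a failed attempt
    return True
```

The default of 20 runs is an arbitrary choice for illustration; pick a count that matches how intermittent your failures tend to be.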

BVT tests should only cover aspects of features that are complete.  It is tempting to write tests to the spec and then check them all in even before the feature is spec compliant.  This causes a situation where there are known failures.  As above, this is a sure way to overlook the important failures.  Instead, you should only enable tests after they are passing and where you don't expect that behavior to regress.  If the feature is still in a state of constant flux, it shouldn't be in the BVTs.  You'll end up with expected failures.  BVT tests should reflect what *is* working in the system, not what *should be* working.
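Taken together, the admission criteria amount to a simple filter: a test belongs in the BVTs only if it is critical, stable, and covers a completed feature. The sketch below uses a hypothetical `Candidate` record and `select_bvts` function. In practice these flags would come from human judgment and test history rather than hard-coded values.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record describing a test being considered for the BVTs."""
    name: str
    critical: bool          # would a failure make us drop everything?
    stable: bool            # does it pass every time unless something is really wrong?
    feature_complete: bool  # is the covered behavior finished and not in flux?


def select_bvts(candidates: List[Candidate]) -> List[str]:
    """Return the names of tests that meet all three admission criteria."""
    return [
        c.name
        for c in candidates
        if c.critical and c.stable and c.feature_complete
    ]
```

Anything the filter rejects still has a home in the separate test pass that runs after the BVTs.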


  1. Nice article and I learned something new...
    I learned there are Tennessee fainting goats and what they are !
    ( great analogy BTW )

  2. One of the metrics we use for our BVTs is that they should take no more than 15 minutes to run (10 is better). Since we use a continuous integration process, this limit allows the developer to run the tests before checking in new code, and gives near-immediate feedback from the build system if something goes wrong.

  3. on a different topic:
    Would you kindly shed some light on the career path an SDET might take (has it changed since your famous post in 2005)? Where would an SDET end up in 2 years and, let's say, 5 years? Any feedback on the future of SDETs outside of Microsoft? Would an SDET learn technologies similar to a DEV? For instance, a DEV may develop a tool for WE and gain great experience, while the SDETs are writing the same type of code over and over again that just calls different methods.

  4. Hey Steve,
    Great post, and I will link to it on my post about the subject if you don't mind.
    I think your point "The last thing you want is to become numb to failure" is an excellent point, and something that anyone engaged in automated testing or analysis should keep in mind.
    For example, we know that if static analysis tools are throwing too many red flags, the devs become anesthetized and begin to disregard the potential errors as red herrings.
    - Bj -

  5. @Jason, that's a great suggestion that almost made it into the article.  Make sure the BVTs run quickly.  10-15 minutes is a good target.
    @Anonymous, that is a good question which I'll try to cover in a post soon.

  6. @IM Testy - go ahead and link.  It is definitely an important point and one which is often overlooked.  It's very important to have a bright line between failure and success.

  7. Steve Rowe joins my blogroll due to a nice post about Keeping your BVTs Clean.

  8. I was always under the notion that BVTs were supposed to be a small set of tests to ensure the build output was not corrupted and the product was in a testable state. I feel that BVTs are pointless if they don’t do this - since a rebuild must happen - and that general feeling comes nearer the beginning of the product cycle when one is trying to figure out how to get setup right or put new components in, and again near the end, when all the pieces are finally coming together.

  9. Leah, what you describe is sometimes called a BAT or build authentication test.  Other times it is called a BVT.  Either way, it's a little different from what I'm describing.  The purpose of what I'm describing is to determine if major functionality has broken.  This is a step up the ladder from just "is the build corrupt?"  Both sorts of test suites are useful.  Often a build must pass BATs before being released to BVTs.

  10. BVTs or Build Verification Tests are standard Microsoft parlance for the tests we run every day to ensure