Friday, May 30, 2008

We Need A Better Way To Test

Testing started simply.  Developers would run their code after they wrote it to make sure it worked.  When teams became larger and code more complex, it became apparent that developers could spend more time coding if they left much of the testing to someone else.  People could specialize in developing or testing.  Most testers in the early stages of the profession were manual testers.  They played with the user interface and made sure the right things happened.


This works fine for the first release, but after several releases it becomes very expensive.  Each release has to test not only the new features in the product but also all of the features shipped in every earlier version.  What took 5 testers for version 1 takes 20 testers for version 4.  The situation just gets worse as the product ages.  The solution is test automation.  Take the work people do over and over again and let the computer do that work.  There is a limit to the utility of this, but I've spoken of that elsewhere and it doesn't need to be repeated here.  With sufficiently skilled testers, most products can be tested in an automated fashion.  Once a test is automated, the cost of running it every week or even every day becomes negligible.


As computer programs became more complex over time, the old testing paradigm didn't scale and a new paradigm--automated testing--had to be found.  There is, I think, a new paradigm shift coming.  Most test automation today is the work of skilled artisans.  Programmers examine the interfaces of the product they are testing and craft code to exercise it in interesting and meaningful ways.  Depending on the type of code being worked on, a workforce of 1:1 testers to devs can usually keep up.  This was true at one point.  Today it is only somewhat true and tomorrow it will be even less true.  Some day, it will be false.  What has changed?  Developers are leveraging better programming models such as object-oriented code, larger code libraries, greater code re-use, and more efficient languages to get more done with less code.  Unfortunately, this merely increases the surface area for testers to have to cover.  Imagine, if you will, a circle.  When a developer is able to create 1 unit of code (r=1), the perimeter which a tester must cover is only 3.14.  When the developer uses tools to increase his work and the radius stretches to 2, the tester must now cover a perimeter of 12.56.  The area needing to be tested increases much faster than the productivity increase.  Using the same programming models as the developers will not allow test to keep up.  In the circle example, a 2x boost in tester performance would only cover 1/2 of the circle.
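The circle arithmetic can be checked with a few lines of code. This is only a sketch (Python is an assumption here, and the function names are made up for illustration). With the circumference formula 2πr, the values for r=1 and r=2 are about 6.28 and 12.57. With the area formula πr², they are about 3.14 and 12.57 (12.56 if π is rounded to 3.14), which are the figures quoted above. Doubling the radius doubles the circumference but quadruples the area, and that is why a 2x boost covers only half of the larger circle.

```python
import math


def circumference(r: float) -> float:
    """Perimeter of a circle of radius r (2 * pi * r)."""
    return 2 * math.pi * r


def area(r: float) -> float:
    """Area of a circle of radius r (pi * r squared)."""
    return math.pi * r ** 2


for r in (1, 2):
    print(f"r={r}: circumference={circumference(r):.2f}, area={area(r):.2f}")

# A tester who covered the r=1 circle and then doubles their output
# covers this fraction of the r=2 circle's area.
covered = 2 * area(1) / area(2)
print(f"fraction covered after a 2x boost: {covered:.2f}")
```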


Is test doomed?  Is there any way to keep up or are we destined to be outpaced by development and to need larger and larger teams of test developers just to keep pace?  The solution to the problem has the same roots as the solution to the manual testing problem.  That is, it is time to leverage the computer to do more work on behalf of the tester.  It will soon be too expensive to hand-craft test cases for each function call and the set of parameters it entails.  Writing code one test case at a time just doesn't scale--even with newer tools.  In the near future, it will be important to leverage the computer to write test cases by itself.  Can this be done?  Work is already beginning, but it is just in its infancy.  The tools and practices that will make this a widespread practice likely do not exist today, certainly not in a readily consumed form.
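To make the idea concrete, here is a minimal sketch of what "more than one piece of code for one test" might look like. Everything in it is an assumption for illustration: the language (Python), the function under test (`clamp`), and the list of interesting values are all hypothetical, and this is not a description of any existing tool. The tester supplies a handful of values and one general property, and the computer enumerates the combinations.

```python
import itertools


def clamp(value: int, low: int, high: int) -> int:
    """Hypothetical function under test: restrict value to [low, high]."""
    return max(low, min(value, high))


# Interesting values a tester supplies once, instead of once per test case.
VALUES = [-10, -1, 0, 1, 10]


def generate_cases():
    """Yield every (value, low, high) combination that has a valid range."""
    for value, low, high in itertools.product(VALUES, repeat=3):
        if low <= high:
            yield value, low, high


def check(value: int, low: int, high: int) -> None:
    """Property-style oracle: no expected result is hand-written per case."""
    result = clamp(value, low, high)
    assert low <= result <= high
    if low <= value <= high:
        assert result == value


def run_generated_tests() -> int:
    """Run every generated case and return how many were checked."""
    count = 0
    for case in generate_cases():
        check(*case)
        count += 1
    return count


print(run_generated_tests(), "generated cases passed")
```

With five values per parameter, this small script checks 75 cases. The catch is the oracle: a generated test is only as good as the property it checks, so a person still has to decide what "correct" means.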


This coming paradigm shift makes testing a very interesting place to be working today.  On the one hand, it can be easy for testers to become overwhelmed with the amount of work asked of them.  On the other hand, the solutions to the problem of how to leverage the computer to test itself are just now being created.  Being in on the ground floor of a new paradigm means the ability to have a tremendous impact on how things will be done for years to come.


Update:  There are a lot of people responding to this post who are unfamiliar with my other writing.  Without some context, it may seem that I'm saying that test automation is the solution to all testing problems and that if we're smart we can automate all of the generation.  That's not what I'm saying.  What I advocate in this post is only a powerful tool to be used along with all of the others in our toolbox.  If you want some context for my views, check out:


9 comments:

  1. Not sure I entirely agree with your post, but it's intriguing.
    James Bach, otoh, doesn't seem to like you at all (http://www.satisfice.com/blog/archives/128)

  2. s/perimeter/area/ (your figures are correct for area, not for perimeter; also perimeter doesn't illustrate your point, as it grows as O(n) rather than O(n^2))

  3. @Billy Bob.  Thanks.  I posted a response over at James Bach's site.  He's right that my history is a bit simplistic, but I think it represents the way testing actually rolled out across our industry.  He's wrong in his characterization of my position.  Automation is not the sole solution.  Neither is manual testing.  My stance on this is clear in this blog.
    @Maurits, I'm using 2*pi*R which is perimeter.  Area would be pi*R^2.  For R=1 or 2 the numbers turn out to be the same.  Area may be a better explanation of the runaway nature of the problem, but I was calculating perimeter.  It shows enough of the problem.

  4. Yes, automated test generation is a new area of development - it's starting to show real promise. It's not nearly mature enough to depend on.
    And even when automated test generation is 100 years old, it won't be comprehensive enough to depend upon. The whole point of having intelligent testers is that they can think of things that a developer and computer can't.
    Here's a trivial example: no computer will be able to unplug itself halfway through a critical operation. It can simulate a shutdown, it can terminate the process, but it can't actually pull the power.
    A less extreme example is just the understanding of the business context. No computer-generated test will understand the horrible legal problems that result if you accidentally mark a customer as type "T" instead of type "P". A computer-generated test could ensure it's a string, but won't care about a value that isn't used elsewhere in the application.
    The broader version of this is something I argue about a lot with developers. There will always be inputs into your system that violate the assumptions made in building it. A computer generated test will not be able to violate such assumptions.
    My prediction: Computer-generated tests will end up being an addition to unit tests. They won't reduce a tester's workload. They won't even reduce a developer's workload - the developer will still have to write unit tests too. They'll just be a tool that helps provide shallow, broad coverage.
    Essentially, they'll provide the same level of assurance that static analysis tools provide, but in a slightly different area. They'll catch the obvious bad stuff, and so will be worth it. They'll have no idea about your particular application, just some particular code constructs.
    Another useful tool in the shed. But not a silver bullet for saving testers.

  5. Steve,
    I appreciate the problem that you're bringing up - testing scalability in a software product scenario ... as the product grows bigger, you need a *larger* testing effort to cover the increasing testing scope. But I do not agree with your automation paradigm.
    >>> Is test doomed?  Is there any way to keep up or are we destined to be outpaced by development and to need larger and larger teams of test developers just to keep pace.
    How about adding intelligent manual testers? Why does testing need to be done by either coders or test developers? If your problem seems to be with the scalability of manual testing across releases - what are your suggestions around manual testing (I agree with you that automation can help in covering some portion of regression testing)?
    In my view, the concern that I have with your automation paradigm is that - it has "automation" "coders" in the center of the scheme not the testing (remember we are trying to solve a problem with testing).
    I would say the new paradigm would do well if we keep testing (human testing) at the center and then ask - here is a problem that I am trying to solve - how automation can help me here?
    In your response to Billy Bob's comment, you mentioned that "Automation is not the sole solution.  Neither is manual testing". This post of yours talks about "Automation"; what about manual testing ... where is the balancing act?
    Shrini Kulkarni

  6. Hi, Steve...
    I recently saw a presentation at STAR East from James Whittaker.  He made a set of statements that rather startled me.  He suggested that, thanks to the spelling checkers and grammar checkers in Microsoft Word, a developer who wasn't very good at writing could produce something that was good, or at least okay.
    I think what was missing from the presentation was discussion about *value*.  What do we value?--writing that is syntactically correct, or writing that says something meaningful and that contains as few things as possible that threaten its value?  Do our automated tests take into account the notion that different people might value different things, and that one of the tester's primary goals is to recognize different constituencies and the ways in which their values might be threatened?  No; not currently, and not in the foreseeable future.  We can't even get people to read people's minds, for crying out loud; I'm not prepared to wait for someone to automate that process.
    As a parallel, take the writing of my friend and colleague James Bach, for example.  James regularly writes things that I agree with entirely.  He does so in a way that I appreciate but that others sometimes don't.  I think that this sometimes threatens the value of his writing.  As his colleague, and as a tester, I mention this problem to him when I see it.  He takes my comments seriously, and he takes the threat to the value of his writing into account, and as his own program manager, he gets to make the call as to whether he should revise his words or not.  Sometimes he does, sometimes he doesn't.
    *But his spelling checker and grammar checkers don't say a thing about any of this*--and when the grammar checker *does* say something, it usually gets it wrong.
    I'll contend that the solution to the problems of bad software is *not* more test automation, any more than the solution to controversy about testing is more automated spelling or grammar checks.  These tools might help to reduce egregious spelling mistakes and spectacular grammatical problems, but they won't help us argue our points any better.
    ---Michael B.

  7. Fair warning: I found this post through James Bach's blog (http://www.satisfice.com/blog/archives/128), as I suspect some of the other commenters did. He came down on you pretty hard, so thought I'd check out the source.
    I don't pretend to be able to write the history of testing (or of development for that matter) in a blog comment, so I'll keep mum on that part. What struck me, though, are two of the points you made: (1) that over time the testing burden increases as there are new features and the existing features have to be maintained; and (2) that automatic test case generation is a way to combat the growth in the number of needed tests.
    Regarding your first point, I absolutely agree: Software generally gets more complex as it ages. That holds true for development as well as test. One thing I don't think we talk about enough or address explicitly is retiring features and retiring tests. Sometimes it's not feasible, but more often than not I find we test features and old ways of doing things because they're there, when that feature may be unused and it's time to get rid of it, or at least say that it can't be used in certain ways in conjunction with new features. This sounds naive, but I think it's generally worth asking the question - I've had shocking luck with this one, particularly with features that are one version old. For some reason, some features just aren't picked up and used, and they can be quietly retired (with some comment about how it seemed like a good idea at the time, but...).
    Regarding your second point, I think you may have stepped into a small minefield. "Automated testing" is a loaded phrase, with zealots on both sides of the fence claiming everything from "automated testing isn't real testing. Testing is a thinking human activity that questions the system." to "100% automation is a good goal". I understood your point somewhat differently, though. If I understood you correctly (and please correct me if I'm wrong), you're not talking about writing automated tests. You're talking about finding a way to have automated tests generate themselves. This I find quite interesting. It's not a complete solution to all testing problems, but I can think of a lot of situations where a generated test would be useful, particularly at the unit test level, or in areas where the same problem has to be solved over and over.
    .....and as usual, once I start typing I realize this has become rather long. So I've elaborated on my own blog instead of filling yours! Thoughts continue at:  http://blog.abakas.com/2008/06/automated-automation.html

  8. @Catherine, I think you understand my point correctly.  Regarding getting rid of old tests and features, sometimes that is possible, but sometimes it isn't.  Remember that I work on an operating system.  It's really hard to get rid of any features because there are always people using them.  We can't stop testing them because some change at a lower level might unexpectedly break them.  Even a small change in timing can have crazy effects.
    On the subject of automation, I'm talking about using tools to leverage a person's ability to write tests.  There are varying degrees to which a person can leverage the computer to generate tests.  The point is that it has to be more than one piece of code for one test.
    You are correct that it is not a complete solution.  It isn't and I never meant to imply that it was.  If you read the other posts on my blog you'll see that I don't advocate 100% automation and this enhanced form is no different.
    @other - I'm reading in reverse order.  I'll respond soon.

  9. @Ben, I agree that automatically generated tests will not be 100% of what is necessary to qualify a product.  Certainly we'll still need a lot of sentient tests (to use James Bach's term).  There will always be a need for manual tests and for hand-crafted automated tests.  Automated test generators need not be unguided though.  They could be influenced by a human and then generate a lot of test cases based upon that person's input.  
    @Shrini, look at my other posts on the test topic to see where I think the balancing act is.  You are correct that there is one.  There is no silver bullet in testing.  You are right that not all manual testing is simplistic.  Some of it is, some of it isn't.  That is, imho, irrelevant to the main point which is that 1) the regression testing burden for a large product gets out of hand very quickly and 2) the surface area requiring testing is increasing at a very rapid rate.  This implies the need for automated tests and for leveraging the computer to increase the speed of the automation.  This will allow intelligent manual testers more time to poke around the interesting parts of the system.  You are correct that I'm not talking about how to improve manual testing.  That isn't the point of this particular post.
    @Michael, I agree that more testing will not make bad software good.  You can't test in quality.  It will, however, help to gauge the quality level of more software.  As for your statements about value, I totally agree.  Value is more than correctness.  I do not advocate using only some automated tool to do all of the testing.  It can be a part of the arsenal, but shouldn't be the whole thing.
