Tuesday, February 26, 2008

When to Test Manually and When to Automate

There's a balancing act in testing between automation and manual testing.  Over my time at Microsoft I've seen the pendulum swing back and forth between extensive manual testing and almost complete automation.  As I've written before, the best answer lies somewhere in the middle.  The question then becomes how to decide what to automate and what to test manually.  Before answering that question, a quick diversion into the advantages of each model will be useful. 

Manual testing is the most flexible.  Test case development is very cheap.  While skilled professionals will find more, a baseline of testing can be done with very little skill.  Verification of a bug is often instantaneous.  In the hands of a professional tester, a product will give up its bugs quickly.  It takes very little time to try dragging a window in a particular way or entering values into an input box.  This has the additional advantage of making the testing more timely.  There is little delay between code being ready to test and the tests being run.  The difficulty with manual testing is its cost over time.  Each iteration requires human time and humans are quite costly.  The cumulative costs can be very, very large.  Suppose the test team is just capable of testing version 1.0 in the development time and nothing is automated.  To release version 2.0, the team must repeat all of the 1.0 testing and also test the new 2.0 features, roughly twice the work if the two releases are similar in size.  Version 3.0 will cost 3x as much to test as the first version, and so on.  The cost increases are unsustainable.
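To make the arithmetic concrete, here is a minimal sketch of that cost growth.  It assumes every release adds one feature set of equal size and that every earlier feature set must be retested by hand; the function names and the unit cost are hypothetical, chosen only for illustration.

```python
def manual_test_cost(version: int, cost_per_feature_set: float = 1.0) -> float:
    """Cost of fully testing release `version` when nothing is automated.

    Assumption: each release adds one equally sized feature set, and all
    earlier feature sets are retested manually for every release.
    """
    return version * cost_per_feature_set


def cumulative_manual_cost(versions: int, cost_per_feature_set: float = 1.0) -> float:
    """Total manual testing cost summed over releases 1..versions."""
    return sum(
        manual_test_cost(v, cost_per_feature_set) for v in range(1, versions + 1)
    )


# Print per-release and cumulative cost for the first four releases.
for v in range(1, 5):
    print(v, manual_test_cost(v), cumulative_manual_cost(v))
```

Under these assumptions the per-release cost grows linearly, so the total spent across all releases grows quadratically.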

On the opposite end of the spectrum is the automated test.  Development of automated tests is expensive.  It requires a skilled programmer some number of hours for each test case.  Verification of the bug can require substantial investment.  The up front costs are high.  The difficulty of development means that there is a measurable lag between code being ready for test and the tests being ready to run.  The advantage comes in the repeated nature of the testing.  With a good test system, running the tests becomes nearly free.

With those advantages and disadvantages in mind, a decision framework becomes obvious.  If testing only needs to happen a small number of times, it should be done manually.  If it needs to be run regularly--daily or even every milestone--it should be automated.  A rule of thumb might be that if there is a need for a test to be run more than twice during a product cycle, it should be automated.  Because of the delay in test development, most features should be tested manually once before writing the test automation for the feature.  This is for two reasons.  First, manual exploratory testing will almost always be more thorough.  The cost of test development ensures this.  Second, it is more timely.  Finding bugs right away while development still has the code in their minds is best.  Do thorough exploratory testing of each feature immediately.  Afterwards, automate the major cases.
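The rule of thumb can be expressed as a simple break-even calculation.  This is only a sketch: the cost figures are hypothetical inputs you would have to estimate for your own team, and the model ignores things like test maintenance.

```python
import math
from typing import Optional


def break_even_runs(
    manual_cost_per_run: float,
    automation_dev_cost: float,
    automated_cost_per_run: float = 0.0,
) -> Optional[int]:
    """Smallest number of runs at which automating costs no more than
    running the test by hand every time.

    Returns None when automation never pays off (no saving per run).
    """
    saving_per_run = manual_cost_per_run - automated_cost_per_run
    if saving_per_run <= 0:
        return None
    return math.ceil(automation_dev_cost / saving_per_run)


def should_automate(
    expected_runs: int,
    manual_cost_per_run: float,
    automation_dev_cost: float,
    automated_cost_per_run: float = 0.0,
) -> bool:
    """True when the expected number of runs reaches the break-even point."""
    n = break_even_runs(manual_cost_per_run, automation_dev_cost, automated_cost_per_run)
    return n is not None and expected_runs >= n
```

For instance, if a manual pass takes one hour and the automation takes three hours to write, `break_even_runs(1.0, 3.0)` gives 3, which lines up with the "more than twice" rule of thumb.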

This means that some tests will be run up front and never again.  That is acceptable.  If the right automated tests are chosen, they will act as canaries and detect when things go wrong later.  It is also inevitable.  Automating everything is too costly.  The project won't wait for all that testing to be written.  Those who say they automate everything are likely fooling themselves.  There are a lot of cases that they never write and thus are never run.

Monday, February 25, 2008

Classes Should Exhibit High Cohesion

This is part 4 of my ongoing Design Principles series.

When designing the classes in your model, it is important that they each have a specific role to play.  Cohesion means joining together to form a united whole.  Classes and methods should be highly cohesive.  They should have a single function.  Methods on a class should be single function and together the methods on a class should all work toward accomplishing the same purpose.  Adding an add method to a string class is not being cohesive.  Having a "utility" class isn't either.  Classes (or methods) that try to play more than one role are harder to use, harder to test, and harder to change.  The reason is side effects.  If a class does many things, there is a high likelihood that using one part will affect how another part works. 

It is hard to understand the use model of a multifunctional class.  A class with a purpose has an obvious place in a design.  A class without a clear purpose says less about how it is meant to be used and is therefore easier to misuse.  Classes with multiple roles to play also have more complex interfaces.  Each role subtly influences the design of the other roles and may make the use model more complex than necessary.  Simple, self-documenting use models are going to result in the cleanest code with the fewest bugs.

It is important in testing to make sure that one method call on a class doesn't impact other method calls.  If the number of methods on a class gets large, the combinatorial explosion forced on test becomes unmanageable.  Likewise, every time a change is made, all methods on a class must be tested to make sure that they didn't break.  The more methods--and the more disparate the methods--the more burdensome this becomes for test.  That means longer test passes, less new exploratory testing, and a poorer product. 
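One rough way to see the growth is to count just the ordered pairs of distinct methods (call A, then check that B still behaves).  This sketch looks only at pairs, which is my simplification; longer call sequences grow even faster.  The method names are made up.

```python
from itertools import permutations


def ordered_call_pairs(method_names):
    """All ordered pairs (first, second) of distinct methods on a class."""
    return list(permutations(method_names, 2))


def pair_count(n_methods: int) -> int:
    """Number of ordered pairs of distinct methods: n * (n - 1)."""
    return n_methods * (n_methods - 1)


# A small class with three methods already has six interactions to check.
print(ordered_call_pairs(["open", "read", "close"]))
print(pair_count(5), pair_count(20))
```

Going from 5 methods to 20 takes the pair count from 20 to 380, which is why a large, disparate class is so much more burdensome to test than several small ones.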

Most importantly, complex multi-use classes will be difficult to change.  The code underneath will be interconnected and a change that should affect only one role may end up triggering a cascade of changes that impact all of the other roles.  Tradeoffs may be required.  What should be a simple change may mean untangling various pieces of internal state.  It is hard to be comfortable with a change to a complex class.  Unit tests will ease the discomfort, but there is so much that could potentially break that one can never be sure something subtle didn't just change for the worse.

How can a poor design be recognized?  Here are some questions to consider:

  • Is the name generic?  If it doesn't imply a single role, the class may be trying to do too much.
  • Conversely, can this class be given a simple name?  If naming is difficult, the class is too complex.
  • Classes with a large number of methods often are trying to do more than one thing.  Consider breaking them up.
  • Classes should have only one reason to change.
  • Methods that take a lot of options are generally poorly designed.  If a method checks a state and then branches right at the beginning, it should be more than one method.

DoSomething(enum Type, int Data)
    switch (Type)
        type1: do something
        type2: do something else

is better written as:

DoSomethingType1(int Data)
    do something

DoSomethingType2(int Data)
    do something else
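Here is the same idea as a small runnable sketch in Python.  The row-export scenario and every name in it are hypothetical, picked only to give the pseudocode something concrete to do.

```python
from enum import Enum


class ExportFormat(Enum):
    """Hypothetical option type, standing in for `enum Type` above."""
    CSV = 1
    TSV = 2


# Less cohesive: one function that branches on an option right at the top.
def export_row(fmt: ExportFormat, values) -> str:
    if fmt is ExportFormat.CSV:
        return ",".join(str(v) for v in values)
    if fmt is ExportFormat.TSV:
        return "\t".join(str(v) for v in values)
    raise ValueError(f"unsupported format: {fmt}")


# More cohesive: one function per role, with no mode flag to get wrong.
def export_row_csv(values) -> str:
    return ",".join(str(v) for v in values)


def export_row_tsv(values) -> str:
    return "\t".join(str(v) for v in values)
```

The split versions have simpler signatures, cannot be called with an unsupported option, and can each be tested and changed without touching the other.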

Sunday, February 24, 2008

Podcasts I Listen To Regularly

It's been a while since I posted my list of podcasts and my tastes have changed since then.  Here's what I'm listening to on a regular basis right now:




There are a good number of others that I listen to with less regularity.

Tuesday, February 19, 2008

Is There Value In Code Uniformity?

There seem to be two opposing views of coding standards.  Some think that they should enforce uniformity in code.  Everyone should use K&R braces, leave two lines between functions, and have a space after the if and before the opening parenthesis.  Others (like me) think that there is little value in uniform-looking code.  The value of coding standards is in their ability to stop bugs from getting into the code.  Patterns that are error-prone, like failing to create a new scope { } for the body of an if statement, should be banned.  Others that are odd but not necessarily error-prone should not.  I'm interested in hearing from you, my readers, on your experience here.  Is there more value in having uniform code than I give it credit for?  Does it really make reading code that much easier?  Do bugs really stand out more or is it just format deviation that stands out?


2/20 - Ugh, my spam filter stopped every response to this.  Unblocked now.

Sunday, February 17, 2008

HD-DVD: 2006 - 2008, R.I.P.

According to reports, Toshiba will soon stop manufacturing HD-DVD equipment.  The writing has been on the wall since Time-Warner's January announcement that it would go exclusively to the BluRay format.  More recently Netflix and WalMart have announced that they will do the same.  It's unfortunate because, in my mind, HD-DVD was the superior format.  It doesn't have all of the profile issues that are currently plaguing BluRay.  Alas, the market has chosen a winner.  The format war will soon be over.  The question is, will consumers notice?  Will they embrace the new format?  Or will they continue to ignore the expensive high-def disc formats?

[2/19/08] Update:  It's official.  Toshiba announced that they will stop development and manufacturing of HD-DVD equipment.


Note:  I'm not on the HD-DVD team here at Microsoft and have no inside knowledge.  This is just my take based on public news sources.  Don't take this as confirmation of anything.

Thursday, February 14, 2008

Evaluating Your Skill As A Leader

Someone recently characterized for me one way leaders are evaluated.  This certainly isn't the only way and it doesn't catch everything, but it is a good place to start.  The list is succinct and the questions thought-provoking.  Here is the list:

    • Results – How is your day job going? 
    • Leadership – Who do you make better?
    • Strategic Insight – How far out are you thinking?  A few days?  A few months?  Several years?
    • Scope – Who is changing because of you?  Your team?  Your group?  The company?  The industry?
    • Sphere of Influence – Who is listening to you?

These are good questions to keep in mind when looking to advance one's career into leadership.

Monday, February 11, 2008

Arc Is Out

Over 6 years ago Paul Graham told the world that he was working on a new programming language called Arc.  It would be some derivative of Lisp, but otherwise not much was known about it.  Graham is the author of 2 books on Lisp and a popular series of essays on topics ranging from programming to startups.  As he had been mostly quiet about Arc for such a long period of time, many people had forgotten about it or dismissed the idea that it would ever see the light of day.  Two weeks ago Paul announced the initial version of Arc and made it public.  I haven't had a chance to look at it yet but I intend to.  If you are interested, you can find information at arclanguage.org.  There seems to be some excitement building around it.  Only time will tell if it can build a user base or if it fades like so many other new languages.

Thursday, February 7, 2008

Modularization vs. Integration - Which Is Best?

Clayton Christensen's second book, The Innovator's Solution, presents several important theories in the realm of innovation.  Like his first book, The Innovator's Dilemma, the second book should be required reading for anyone in technology and especially managers of technology.  Among the theories, one stands out as the most important and, I think, most applicable to the world of software development.  Christensen calls this the Law of Conservation of Attractive Profits.  In essence it states that the profits in a system move over time.  When a market segment is under-served, profits are made in vertically integrated products.  When a market becomes over-served, the profits instead flow to more modular solutions.  In this post I will lay out the theory.  In a future post, I'll apply it to software development.

For every market segment--whether PCs or winter jackets--there are certain metrics that matter.  In the world of PCs for a long time it was speed.  People bought one computer over another because it was faster.  In an early market, the products do not sufficiently satisfy the demand for that metric.  Computers were too slow.  In these markets, there is a performance gap.  To make the fastest computer required tightly integrating the hardware, the operating system, and often the application software.  The interfaces between each of the parts had to be flexible so they could evolve quickly.  This meant the parts were proprietary and interdependent.  Companies trying to work on only a single part of the problem found that they couldn't move fast enough.  Look at the world of computer workstations.  When Windows NT first tried to take on the Sun and HP workstations of the world, it wasn't as fast.  Intel made the processors, Dell made PCs, Microsoft made the operating system.  By comparison, Sun made the SPARC processor, the Solaris operating system, and the SPARCstation.  It was difficult to squeeze as much speed out of an NT system as Sun could get out of its own.  Because Sun's workstations provided more performance where the market wanted it, Sun was able to extract higher "rents" (economist-speak for profits).

Eventually every market's primary metric is sufficiently supplied by available solutions.  Products can be said to have a performance surplus.  At this point, customers no longer make purchasing decisions based on the metric--speed--because most solutions provide enough.  Customers are willing to accept higher performance, but they aren't willing to pay for it.  Instead, their purchasing criteria switch to metrics like price, customization and functionality.  Modular architectures trade off performance for the ability to introduce new products more quickly, lower costs, etc.  Products become more commoditized and it is hard to extract high rents for high performance.  However, Christensen says that the profits don't disappear, they only shift to another location in the value chain.  Those companies who are able to best provide the market's new metrics will make the most money.  In the example of the workstations, once computers became fast enough, the modular solutions based around Windows NT began to make a lot more of the money.  The costs for these were lower, the ability to customize greater, and the support ecosystem (3rd party devices and software) larger.

Looking closely, it becomes apparent that markets are a lot like fractals.  No matter how close the zoom, there is still a complex world.  Each of the modular parts is itself a market segment with its own primary metrics.  Each one is subject to the modularization/integration cycle.  When a system becomes ripe for modularization, the profits move to the integrated modules which best satisfy the new metrics.  The secret to continuing to gain attractive profits is to notice when this transition is taking place and give up vertical integration at the right moment, choosing instead to integrate across the parts of the value chain least able to satisfy customer demand.

This theory seems to explain Apple's success with the iPod.  The PlaysForSure approach taken by Microsoft was a modular approach.  Vendors like Creative supplied the hardware.  Microsoft supplied the DRM and the player.  Companies like Napster supplied the music.  There are 3 tiers and 2 seams between them that must be standardized.  In an emerging market where the technology on all fronts was not good enough, is it any wonder that this approach was beaten by an integrated strategy?  Of course, hindsight is 20-20 and what is obvious to us now may not have been obvious then.  Still, Apple came at the problem controlling all 3 pieces of the puzzle.  It was able to satisfy the metric--ease of use--much better than the competition.  We all know how that turned out.  The theory indicates that at some point the metric will be satisfied well enough and the advantage of the integrated player will dissipate.  With the move away from DRM'd music and the increased quality of the available hardware, this day may be upon us.  Amazon's MP3 store seems to be gaining traction.  Competitors like the Zune and the Sansa players are making some inroads in the player space.  A dis-integrated model may be emerging.

Tuesday, February 5, 2008

Vista Audio 1 Year Later - Interview with Cakewalk

One of my readers tipped me off to an interesting interview with Noel Borthwick, CTO of Cakewalk.  He talks about the improvements made in Vista for lower-latency audio via the WaveRT driver model it introduced.  Most motherboard audio parts support it, but support among non-motherboard audio parts has been slow to develop.  Those who are using it with the supporting parts are finding success.  He also talks about MMCSS (the mechanism by which Windows gives priority to audio applications to avoid glitching), MIDI, and general Vista issues (UAC, graphics drivers, etc.).  Some things are going well, others are still pain points.  An interesting read.

Monday, February 4, 2008

Project BBQ Reports Are Released

Project BBQ is the premier interactive music industry think tank.  Everything from PC audio to game audio to audio creation is covered.  I attended this past October and the final reports have finally been released.  This year's subjects included suggestions for improving the base level of PC audio, audio metadata, a game producer's guide for audio, new composition input devices, and many more.  You can find them all here.

I was part of the "Fixing Broken PC Audio" group also known as Mr. Miagi's Little Trees (strangely, there is no story there--sorry).  We began with the premise that Vista and the new Windows Logo requirements along with it had improved the minimum quality bar for PC audio but that there were still many problems to solve.  We ranked the problems we observed and detailed the state of the industry and suggested solutions for each problem.  We covered everything from the challenges of working with HDMI and Bluetooth audio to latency with a lot in between.  Check out the report for details.

There is a forum thread over on the O'Reilly digital media site discussing the reports.

There are also pictures from the event.  You can see me in the middle here.  In these two we're working on our report.