Friday, March 19, 2010

Resume Advice: List Your Classes and Projects

It is campus hiring season and I have been reading a lot of college resumes lately.  One thing I have noticed is that many resumes leave off what I consider some of the most relevant information.  As important as it is that someone made the Dean’s list or worked at Best Buy, it is even more important to know about the experience that is directly relevant to the job.  Some candidates have internships, and those are directly relevant.  Others have not been so lucky.  That’s fine, but if you find yourself in this bucket, please, please list your classes and describe the major projects you worked on.  Knowing what projects you worked on gives me, the interviewer, a much better sense of what you are capable of.  It also gives me more material to latch onto and ask questions about.  As a candidate, you want me asking questions about your projects, because you should understand them well and be able to talk about them fluently.  Additionally, without classes and projects, a resume with jobs at Home Depot or the local KFC looks just like every other resume.  It is your internships, projects, and possibly classes that set you apart.

Tuesday, March 16, 2010

Pass Rates Don’t Matter

It seems obvious that test pass rates are important.  The higher the pass rate, the better the quality of the product.  The lower the pass rate, the more known issues there are and the worse the quality.  It then follows that teams should drive their pass rates to be high.  I’ve shipped many products where the exit criteria included some specified pass rate, usually 95% or higher.  For most of my career I agreed with that logic.  I was wrong.  I have come to understand that pass rates are irrelevant.  Pass rates don’t tell you the true state of the product.  What matters is which bugs remain in the product, and pass rates don’t actually show this.

The typical argument for pass rates is that they represent the quality of the product.  This assumes that the tests represent the ideal product: if they all passed, the product would be error-free (or close enough).  Each case is then an important aspect of this ideal state and any deviation from 100% passing is a failure to achieve the ideal.  This isn’t true though.  How many times have you shipped a product with 100% passing tests?  Why?  You probably rationalized that certain failures were not important.  You were probably right.  Not every case represents this ideal state.  Consider a test that calls a COM API and checks the return result.  Suppose you pass in a bad argument and the return value is E_FAIL.  Is that a pass?  Perhaps.  A lot of testers would fail the case because the result wasn’t E_INVALIDARG.  Fair point; it probably should be.  Would you stop the product from shipping because of this though?  Probably not.  The reality is that not all cases are important.  Not all cases represent whether the product is ready to ship or not.
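
To make that concrete, here is a minimal sketch of the kind of check in question.  IWidget and GetName are hypothetical stand-ins rather than a real interface; the HRESULT values are the standard COM ones.

#include <windows.h>

// IWidget is a hypothetical interface, used only for illustration.
struct IWidget : public IUnknown
{
  virtual HRESULT STDMETHODCALLTYPE GetName(wchar_t* buffer, int length) = 0;
};

// Pass a deliberately bad argument and check the result code.
bool TestGetNameRejectsNullBuffer(IWidget* widget)
{
  HRESULT hr = widget->GetName(NULL, 0);

  // Strict interpretation: only E_INVALIDARG passes.  E_FAIL would
  // fail the case even though the call still rejected the bad input.
  return hr == E_INVALIDARG;
}

Either return value rejects the bad input.  The strict check above encodes a judgment about API polish, not about whether the product is safe to ship.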

Another argument is that 100% passing is a bright line that is easy to see.  Anything less is hard to see.  Did we have 871 or 872 passing tests yesterday?  If it was 871 and today is 871, are they the same 129 failures?  Determining this can be hard and it’s a good way to miss a bug.  It is easy to remember that everything passed yesterday and no bugs are hiding in the 0 failures.  I’ve made this argument.  It is true as far as it goes, but it only matters if we use humans to interpret the results.  Today we can use programs to analyze the failures automatically and to compare the results from today to those from yesterday.
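
As a sketch of what that automation might look like, the program below diffs yesterday’s failing tests against today’s so that no human has to eyeball raw counts.  The file names and the one-test-name-per-line format are assumptions for illustration, not any particular team’s tooling.

#include <algorithm>
#include <fstream>
#include <iostream>
#include <iterator>
#include <set>
#include <string>

// Read a list of failing test names, one per line (names are assumed
// to contain no whitespace).
std::set<std::string> ReadFailures(const char* path)
{
  std::ifstream file(path);
  return std::set<std::string>(std::istream_iterator<std::string>(file),
                               std::istream_iterator<std::string>());
}

int main()
{
  std::set<std::string> yesterday = ReadFailures("failures_yesterday.txt");
  std::set<std::string> today = ReadFailures("failures_today.txt");

  // Print only the failures that are new today; everything else was
  // already known yesterday.
  std::set_difference(today.begin(), today.end(),
                      yesterday.begin(), yesterday.end(),
                      std::ostream_iterator<std::string>(std::cout, "\n"));
  return 0;
}

With a report like this, carrying 129 known failures from day to day costs nothing; only a change in the set demands attention.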

As soon as the line is not 100% passing, rates do not matter.  There is no inherent difference in the quality of a product with 99% passing tests and the quality of a product with 80% passing tests.  “Really?” you say.  “Isn’t there a difference of 19%?  That’s a lot of test cases.”  Yes, it is.  But how valuable are those cases?  Imagine a test suite with 100 test cases, only one of which touches some core functionality.  If that case fails, you have a 99% pass rate.  You also don’t have a product that should ship.  On the other hand, imagine a test suite for the same software with 1000 cases, written by much more zealous testers, where 200 cases intersect that one bug; perhaps it is in some activation code.  That suite passes at 80%, yet the two pass rates represent the exact same product.  The pass rate does not correlate with quality.  Likewise, one could imagine a suite of 1000 cases where the 200 failures were bugs in the tests themselves.  That is an 80% pass rate and a shippable product.

The critical takeaway is that bugs matter, not tests.  Failing tests represent bugs, but not all bugs are equal.  There is no way to determine, from a pass rate, how important the failures are.  Are they the “wrong return result” sort or the “your API won’t activate” sort?  You would hold the product for the second, but not the first.  Pass/fail rates do not provide the critical context about what is failing, and without that context it cannot be known whether the product should ship.  Test cases are a means to an end, not the end in themselves.  They are merely a way to reveal the defects in a product.  After they do so, their utility is gone.  The defects (bugs) become the critical information.  Rather than worrying about pass rates, it is better to worry about how many critical bugs are left.  When all of the critical bugs are fixed, it is time to ship the product, whether the pass rate is high or low.

All that being said, there is some utility in driving up pass rates.  Failing cases can mask real failures.  Much like code coverage, the absolute pass rate does not matter, but the act of driving the pass rate up can yield benefits.

Monday, March 15, 2010

The Complexity Hammer

I’ve been doing a lot of interviewing lately, especially of college students.  There is one tendency I see that really separates those who are good from those who still have more learning to do: good programmers see elegant solutions to problems, while less skilled programmers solve every problem by adding more complexity.  Stated another way, the thing that separates the best programmers from the rest is what happens when they run into a serious issue.  In my observation, the best coders step back and look for a more elegant solution.  The less skilled coders assume that their approach is correct and that another piece of state or another special case is the best choice.

Here is an example.  The problem has been changed to protect the innocent.  Oftentimes I ask a question similar to the change-making problem.  That is, write a program to enumerate all of the ways to make change for a dollar.  A typical approach might look something like this.  Note, I didn’t actually compile this code so there could be typos.  If there are, I’m sure you’ll let me know.

void MakeChange()
{
  int moneyLeft = 100;
  for (int quarters = 0; quarters <= 4; quarters++)
  {
    if (quarters) moneyLeft -= 25;
    for (int dimes = 0; dimes <= 10; dimes++)
    {
      if (dimes) moneyLeft -= 10;
      for (int nickles = 0; nickles <= 20; nickles++)
      {
        if (nickles) moneyLeft -= 5;
        for (int pennies = 0; pennies <= 100; pennies++)
        {
          if (pennies) moneyLeft--;
          if (0 == moneyLeft) print…;
        }
      }
    }
  }
}

I know what you are thinking: “That’s not the right way to solve this.”  And you would be correct.  However, I have seen a lot of people give basically this solution.  Their failure to solve it correctly the first time isn’t the point of this post.  Rather, it is their response to the problem.  If you haven’t spotted it yet, this will only work correctly the first time down the for loops.  After we get to zero, we never gain back the money from the pennies we spent during the last nickles iteration.  When I point this out, the response is too often not to step back and re-examine the problem, asking “Is there something wrong with this approach?”  Rather, the typical reaction is to assume that the solution is mostly right and to tweak it.  One might think of this as a specific case of not asking the 5 Whys.  The initial reaction is often just to reset moneyLeft at the top of the quarters loop.  When that doesn’t work, more variables are added.  The resulting solution looks something like this:

void MakeChange()
{
  int moneyLeft = 100;
  int moneyLeftQuarters = 100;
  int moneyLeftDimes = 100;
  int moneyLeftNickles = 100;
  for (int quarters = 0; quarters <= 4; quarters++)
  {
    moneyLeft = moneyLeftQuarters;
    if (quarters) moneyLeft -= 25;
    moneyLeftQuarters = moneyLeft;
    for (int dimes = 0; dimes <= 10; dimes++)
    {
      moneyLeft = moneyLeftDimes;
      if (dimes) moneyLeft -= 10;
      moneyLeftDimes = moneyLeft;
      for (int nickles = 0; nickles <= 20; nickles++)
      {
        moneyLeft = moneyLeftNickles;
        if (nickles) moneyLeft -= 5;
        moneyLeftNickles = moneyLeft;
        for (int pennies = 0; pennies <= 100; pennies++)
        {
          moneyLeft--;
          if (0 == moneyLeft) print…;
        }
      }
    }
  }
}

Not exactly an elegant solution, and not one that is easy to get right.  There are a lot of subtle cases to think through.  Unfortunately, code like this, or code trying to be like this, shows up on my whiteboard too often.  In a simple problem such as this, it is possible to keep all of the cases in your head and get it right.  When the problem becomes larger, however, this is no longer the case.  The programmer with the above solution will fail.  Thus the solution above is not an acceptable answer even though it technically solves the problem.

If one takes a few moments to step back and re-examine the problem, it is easy enough to see that one doesn’t need to track the amount of money left.  It can be trivially calculated when necessary.  This is just a specific case of the principle that one should never keep state that can be calculated.  Tracking such state provides no benefit and offers the possibility that it will differ from the real state.  The solution might look like this:

void BetterMakeChange()
{
  for (int quarters = 0; quarters <= 4; quarters++)
  {
    for (int dimes = 0; dimes <= 10; dimes++)
    {
      for (int nickles = 0; nickles <= 20; nickles++)
      {
        for (int pennies = 0; pennies <= 100; pennies++)
        {
          if (100 == (quarters*25 + dimes*10 + nickles*5 + pennies)) print…;
        }
      }
    }
  }
}

Much more elegant.  Fewer state variables and thus less to get wrong.  All of this stems from the idea that one need not track state; it is all readily available at any moment.  It is this key notion that one needs in order to come up with a much improved algorithm.  As long as the programmer doesn’t step back and question the need to track how much money has been used or how much is left, they will be stuck adding complexity on top of complexity.  That is a prescription for failure.  This is not an isolated case.  Now that I have noticed this tendency, I can often spot it in interviews and even in code reviews.  The moral of the story: always look for the elegant solution first.  Can the problem be solved by eliminating something, or by looking at the problem differently?  Only once you have eliminated these possibilities should you add more state.  Adding state isn’t always the wrong solution, but it can be a crutch to avoid deeper thinking.

A few notes:

The initial paragraph isn’t quite accurate.  The best programmers often see the elegant solution up front and get themselves into such trouble much less often.

The final solution is not optimal.  I know this.  Optimizing it would not benefit the example.