Monday, June 23, 2014

Perceived vs. Objective Quality

I recently heard this story, but I can't recall who told it to me. I have no proof of its veracity, so it may be apocryphal. Nevertheless, it illustrates an important point that I believe holds regardless of whether the story itself is true.

As the story goes, in the late 1990s several Microsoft researchers set out to understand the quality of various operating system codebases, specifically Linux, Solaris, and Windows NT. The perception among the IT crowd was that Solaris and Linux were of high quality and Windows NT was not. The researchers wanted to test that perception objectively and understand why NT would be considered worse.

They used many objective measures of code quality to assess the three operating systems: things like cyclomatic complexity, depth of inheritance, static analysis tools such as lint, and measurements of coupling. Without debating the exact value of this approach, there are reasons to believe these sorts of measurements are at least loosely correlated with defect density and overall code quality.
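As an aside, here is a minimal sketch of what one such measurement can look like in practice: a rough approximation of cyclomatic complexity for Python functions, using only the standard library. It is purely illustrative; I have no idea what tooling the researchers actually used.

import ast

# A rough take on cyclomatic complexity: 1 + the number of decision points in
# each function. Counting a BoolOp as a single branch is a simplification;
# this is for illustration, not a production-grade metric.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.BoolOp)

def cyclomatic_complexity(source: str) -> dict:
    tree = ast.parse(source)
    results = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            branches = sum(isinstance(child, BRANCH_NODES) for child in ast.walk(node))
            results[node.name] = 1 + branches
    return results

sample = "def f(x):\n    if x > 0:\n        return x\n    return -x"
print(cyclomatic_complexity(sample))  # {'f': 2}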

What the researchers found was that Solaris came out on top: it measured as the highest quality, which matched the common perception. Next was Windows NT, close behind Solaris. The surprise was Linux, which came in far behind the other two. Why, then, the sense that it was high quality? The perceived quality of both NT and Linux did not match their objectively measured quality.

The researchers' speculation was that while Linux had a lot of rough edges, its most-used paths were well polished. The primary scenarios were close to 100% solid, whereas the rest were only at, say, 60%. NT, on the other hand, was at 80 or 90% everywhere. That made for high objective quality, but not high experienced quality.

Think about it: if everything you do is 90% right, you will run into small problems all the time. On the other hand, if you stay within the expected lanes on something like Linux, you will rarely experience issues.
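A little back-of-the-envelope arithmetic makes the point (the specific numbers here are mine, purely for illustration): at 90% per task, a run of ten tasks goes smoothly only about a third of the time, while well-polished paths at 99.9% per task almost never disappoint.

# Illustrative only: probability of getting through n tasks without hitting a
# defect, assuming each task independently succeeds with probability p.
def flawless_session(p: float, n: int) -> float:
    return p ** n

print(flawless_session(0.90, 10))   # ~0.35 -- "90% right everywhere" feels rough
print(flawless_session(0.999, 10))  # ~0.99 -- polished primary paths feel solid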

This coincides well with the definition of quality as fitness for function. For the functions it was actually being used for, Linux was very fit. NT supported a wider variety of functions, but it was less fit for each of them and was thus perceived as lower quality.

The moral of the tale: quality is not the absence of defects; it is the absence of the right kinds of defects. The way to achieve higher quality is not to scour the code for every possible defect. That may even have a negative effect on quality due to the randomization it inflicts on the team. Instead, it is better to understand the usage patterns and ensure that those are free of bugs. Data Driven Quality gives the team a chance to understand both those usage patterns and the bugs that are impeding them.

1 comment:

  1. Quality can be simply defined as "fitness for use". When dealing with an OS or any other complex software, its quality is judged based on the functionality that is used and the number of issues discovered during that use. If the typical user uses 10-20% of the functionality provided by such software, it doesn't really help the user that the other 80-90% has a small volume of issues as long as the functionality actually used is perceived as unfit for use. As you pointed out, focusing on the most-used areas and fixing the issues there can help improve the general perception of quality. This, though, needs to be only a starting point in the effort to build a high-quality product. Quality is not a one-time job but a continuous effort.
    There is no "right kind of defects". I think I understood what you meant, though I can't agree with the formulation: it sounds like a definition of quality, but an unfortunate/incorrect one. Quality can be correlated with the absence of defects that have a high or considerable impact on users' perception.
