Thursday, April 28, 2005

Black Box Testing

   I attended a talk on campus yesterday discussing various aspects of testing.  Part of the talk discussed the need for testers to become better versed in the formalities of testing.  I'll leave that subject for another day.  A portion of the talk, however, described an experiment done with some inexperienced testers.  They were asked to create test cases for the Myers Triangle Test.  A lot of the test cases they came up with were not useful.  By that I mean they didn't test the algorithm, or they were redundant with other tests.  Some would try inputting something like an 'A', which is invalid and won't even pass the string->int conversion, or they would try lots of different numbers that all went down the same code path.  If you look at the underlying code, it is obvious why these tests don't make sense.  Too often, though, test plans are full of cases like these.  Why is that?


   I contend that we often test things only at the surface level and don't consider the implementation.  At some point in time I was told that black box testing was a good idea because if you looked at the underlying code, you might make the same flawed assumptions that the developer made.  This is probably also where we got the notion that you shouldn't test your own code.  I never really agreed with the concept of purposeful black box testing but didn't fully challenge the assumption in my mind.  After some reflection though, I am pretty sure that black box testing is almost always less useful than white box testing. 


   Just in case you don't follow, let me define some terms.  Black box testing is testing where you don't understand the implementation details of the item you are testing.  It is a black box.  You put in data, you get out different data; how it transforms the data is unknown.  White box testing is testing where you have the source code available (and look at it).  You can see that there are 3 distinct comparisons going on in the Myers Triangle Test.


   Black box testing can be useful if we don't have the time or the ability to understand what we are testing, but if we do, it is always better to take advantage of it.  Without knowing the details, I have to try every potential input to a program to verify that all of the outputs are correct.  If I know the implementation, however, I can just test each code path.  If all triangles are checked with A == B == C, I don't need to try Triangle(3,3,3) and Triangle(4,4,4) and Triangle(5,5,5).  After the first one, I'm not trying anything new.  Without looking at the code, however, I don't know that.  Not only does white box testing let you see where you don't need to test, it lets you see where you do.  Some years ago I was testing our DirectShow DVD Navigator software.  There is a function for fast forward that takes a floating point number.  From a black box perspective, one would have no idea which numbers to pass.  Just try some and call it good.  In this particular implementation, however, there were different behaviors depending on which number you passed in.  For a certain range of numbers, all frames were decoded and just played quickly.  For a higher range, only I-frames were played.  For everything above that range, the navigator started playing only some of the I-frames.  Without looking at the code, I could not have known which test cases were interesting.  I couldn't guarantee that I tried something from every range.


   What about making wrong assumptions if you look at the code?  Won't that cause you to miss things?  Perhaps.  However, test driven development, unit testing, etc. have proven that testing done by the developer is quite effective.  Testers should also have a spec outlining what proper behavior should be.  If the code deviates from that spec, you found a bug (somewhere--it might be in the spec).  If you use common sense, you are unlikely to miss a bug because you made the same assumption as the developer.  If you do, the trade-off for greater test efficiency is probably worth it.  You'll have found many new bugs for each one you miss.

Friday, April 22, 2005

Programming Languages

Interesting article this morning on CNet.  It discusses language usage.  Bjarne Stroustrup (the creator of C++) claims that C++ is not being overtaken by languages like C# and Java.  He claims it is a matter of marketing.  Java has marketing, C# has marketing, C++ does not.  He's probably right about that.  There is no hype machine for C++.  Sometimes, though, hype influences reality.  If I'm an aspiring programmer, I'll hear all this cool stuff about .Net, Java, or C#.  I will then be more likely to gravitate toward them.  Surveys show that, as a percentage of all programming, C++ has dropped from 76% in 1998 to 46% in 2004.  That may say more about the growth of the programming pool than about the number of C++ programmers, though.  Stroustrup maintains that there are 3 million C++ programmers.


One aspect the article touches on is teaching languages.  When I went to school, Pascal was the language of teaching.  In a lot of places, it is currently Java.  Why do universities insist on "teaching" languages?  Why not teach what people really use?  You can't argue that C++ is too complicated.  It is more complicated than C# or Java, sure.  But then again, a lot of your graduates are going to be programming in C++.  By the above survey, about half of them will.  If they can't handle it in school, how are they going to handle it in the real world?  I am all for taking some classes to expose people to different programming paradigms (C++, Lisp, Smalltalk, C#/Java, Python, C, etc.), but the core of the curriculum should be based around one language.  That should be a language that is useful everywhere.  It makes no sense teaching an OS class using Java.  No one but Sun uses it for OSes, and we know how successful JavaOS was.


My final thought on the subject is that it shouldn't surprise anyone that Java, VB, and C#, and even "languages" like PHP or Perl, are gaining so much momentum.  They are more accessible than C/C++.  They are easier to learn and easier to use.  It's harder to shoot yourself in the foot with VB than with C++.  It's also harder to do many kinds of work.  Why not start students with the more complex languages and work down from there?  To learn C# after you know C++ is easy.  To learn VB after you know C++ is easy.  The other direction, not so much.


So, if you happen to be reading this and just starting out, consider learning C++ early.  You'll be doing yourself a favor.

Wednesday, April 20, 2005

It's not the idea

I had a chance this afternoon to see one of my favorite writers and thinkers, George Gilder.  He came to Microsoft to speak.  Anyway, he said something very interesting.  He stated that patents are not all that valuable because they are open.  Usually having the idea is not worth much until someone can reduce it to practice.  An example he used was that of the microprocessor.  This was something many people had the idea of doing.  It wasn't terribly useful to have that idea until someone figured out how to actually manufacture such a beast.  Once that happened, the idea became valuable.  This knowledge about how to do something is what he calls a "latent."  This is an interesting idea and it has, I think, two implications:



  1. Companies that are obsessed with patents may be going down the wrong path.  The next Microsoft or Intel won't come from having the right patents but rather from having the right latents.
  2. The USPTO should look very critically at "idea" patents.  If something is an actual mechanism of creating something (a latent which is being made open), a patent may be warranted.  If, on the other hand, this is an idea before its time, it should be rejected.  The only purpose the latter can have is to stop someone who has an idea about *how* to do it from doing it.  Imagine someone patenting the idea of the microprocessor.  It would have made it impossible for companies like Intel to have done what they did.

George Gilder works at the Discovery Institute.  His newest book is called The Silicon Eye.

Friday, April 15, 2005

Unix vs Windows

   Over the past few months I have had the opportunity to take an operating systems class at a leading university.  During that time, I have been once again confronted with the whole Unix (*nix, Linux, Mac OS X) versus Windows argument.  It became quite apparent to me during that time that the professor thought that Unix was better than Windows.  It also became apparent to me that he didn't really understand Windows.  Take, for example, our discussion of filesystems.  When referring to Unix, he talked about inodes.  When referring to Windows, he talked about FAT32.  When talking about memory management, he wasn't sure if Windows still used segments.  When discussing UI programming, he showed some code for the most basic Windows application (create the window, run the message pump, handle Windows messages).  The problem was, he saw the message pump and didn't know what the code was doing.  If you don't understand that, you really don't have a right to be criticizing Windows.  The flip side is also true.  If you don't understand Unix, you shouldn't be criticizing it either.


   I have come to the conclusion that most "Windows is better" or "Unix is better" arguments at the holistic level come down to familiarity.  I have had discussions with a friend who is a BSD guy.  Most of his criticisms are outdated or just plain wrong.  I suspect the same is true for my criticisms of Unix.  For a given task, one may be better than the other.  Taken as a whole, though, both can obviously get the task done if you are familiar with them.  The difficulty is that each is hard to understand, so mastering both is really hard.  Also, neither is standing still.  People still argue that Windows crashes a lot (it did in Windows 95 but not in Windows XP) or that Linux can run on a 486 (it can, as long as you don't use a modern UI).  These arguments are outdated.  Often the criticisms "Windows can't do X" or "Unix is hard to make do Y" boil down to "I don't know how to do X on Windows" or the corollary on Unix.  It is an argument of familiarity rather than actual ease of use.


   When I started the class, I hadn't done much Unix work in a long time.  It was painful trying to compile, debug, etc.  I still think that tools like WinDbg are more advanced than DDD/GDB, but it is less painful now than it was.  Likewise, I've come to understand some of the Unix file structure and tools like vim and grep, which makes getting around less difficult.  As I become familiar, it is harder to think "Unix sucks."  I can still point to many places where it does, or where it feels like it does, but I have to be careful that these complaints aren't stemming merely from my ignorance.  I would challenge any Linux/BSD/Mac OS X/Unix/Solaris aficionados out there to do likewise.

Monday, April 11, 2005

Western Digital Understands Warranties

   I have a hard drive which seems to be going bad on me.  It's a Western Digital SATA drive.  So far, no data loss, but the drive has a tendency to disappear after my computer has been on for a long time.  It also tends to start making a clicking noise during the BIOS RAID screen on a warm reboot.  Needless to say, a clicking hard drive that just stops underneath the OS is not something I want vital data on for long.

   The drive is still under warranty, so I head to the Western Digital web site to figure out how to handle the return.  Right off the support page there is a link to check on the warranty status of a drive.  Just type in the serial number and it will tell you when the warranty expires.  Slick.  I type that in and find out I have about 9 months left.  Good.  Now, to get an RMA to return this.  Often this is like pulling teeth.  Call someone, send an e-mail to customer service and just hope they pay attention, post to a web forum and wait for a response; these are all things I have had to do in the past.  Not for Western Digital.  Right on the warranty check results page is a box to type in the reason for the return.  They don't seem to really care, because they only give you 30 characters to type.  After that, you pick a few options about how you want your return handled and you are done.  Shortly you are notified with an automated e-mail containing the RMA number and tracking information.  They'll even ship the new drive before you return the old one.  How cool is that?

   I've dealt with my share of warranties over the years, but this one is by far the easiest and best to date.  Western Digital doesn't try to hold up a profit margin by making warranty service hard.  The drive only costs $70 these days, so their entire profit margin would be eaten up in just one phone call anyway.  Instead, they just let you return it.  No hassle.  Kudos to Western Digital for truly making the customer king.