Ruminations on Computing: April 2007

Sunday, April 29, 2007

Do CS Schools Encourage Sloppy Coding?

I'm currently pursuing a Master's degree in Computer Science. I'm a little over halfway through the program so I've had exposure to several different classes. To date I've not seen a class which required programming where I got a sense that the code I handed in was ever looked at by a TA or a professor. Asking around, this does not appear to be an atypical experience. Some schools have quality standards for the code written by their students but many do not. If it compiles and accomplishes the task, that is sufficient for a good grade.

I totally understand why this happens. The teacher/student ratio is low. Looking at code is difficult and time consuming. Often the professors want to let the students work in their own choice of language. In that case, trying to stay on top of Ruby, C#, Java, C++, Perl, Lisp, Haskell, etc. is probably too much to ask. This raises the question, though, of whether it is good practice.

If you need code to do something simple (as most assignments do) and only the outward behavior matters, hacking it together will suffice. Why bother freeing memory or checking error conditions? Why bother commenting the code? Why even validate your inputs? These are all things that are absolute musts in the industry. If you write code that doesn't handle errors, leaks like a sieve, and is hard to maintain, you'll be looking for work quickly.
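To make the contrast concrete, here is a minimal Python sketch of the kind of small assignment where those habits show up. The function name average_from_file and the one-number-per-line file format are assumptions made up for illustration; the point is only that validating inputs, handling errors, and releasing resources add a handful of lines, not a rewrite.

```python
def average_from_file(path):
    """Return the mean of the numbers in a text file, one number per line."""
    # Validate the input before touching the file system.
    if not isinstance(path, str) or not path:
        raise ValueError("path must be a non-empty string")

    values = []
    # The with-block closes the file even if parsing fails, so no handle leaks.
    with open(path, "r", encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            text = line.strip()
            if not text:
                continue  # Tolerate blank lines rather than crashing on them.
            try:
                values.append(float(text))
            except ValueError as exc:
                # Say where the bad data is instead of failing mysteriously.
                raise ValueError(
                    f"line {line_number}: not a number: {text!r}"
                ) from exc

    # An empty file is an error condition too; avoid dividing by zero.
    if not values:
        raise ValueError("file contains no numbers")
    return sum(values) / len(values)
```

A hacked-together version would be three lines and would pass the same grading script, which is exactly the problem.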

After 4 years of programming like this, one gets into some pretty bad habits without some checks on quality. Wouldn't it be better if we had something akin to code reviews for code which was handed in?

Friday, April 27, 2007

How We'll Avoid the Animated Cursor Bug Next Time

Michael Howard provides a very insightful look at how the animated cursor bug bypassed the numerous security measures in Vista. He discusses the -GS flag, address space randomization, static analysis tools, and fuzz testing. He also talks about the steps we're taking to make sure it can't happen again. Besides giving details on the .ani attack, he also provides what turns out to be a good primer on Vista security measures. If you have an interest in security, go read this post.

Real Numbers In the Next-Gen DVD Battle

Finally some real numbers showing how the two sides compare. In the first quarter, BluRay sold 830,000 titles and HD-DVD sold 335,000 according to Home Media Magazine. Those numbers look pretty bleak for HD-DVD but they are both so small in the overall shiny video disc market as to be insignificant. For some perspective, these sales figures put the total DVD sales at around $1.5 billion for the quarter. Even at $30/disc for the HD formats, their sales are a tiny fraction of the total. The PS3 launch definitely gave BluRay a boost. Will the Wal-Mart deal even things up a bit? It's still too early to call a winner in this race. My money is still on downloads.
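For anyone who wants to check the arithmetic, here is a quick Python sketch using only the figures quoted above. The variable names are mine, and treating every HD disc as a $30 sale is the deliberately generous upper bound from the paragraph, not a measured price.

```python
# Figures quoted in the post for the first quarter of 2007.
BLU_RAY_TITLES = 830_000
HD_DVD_TITLES = 335_000
PRICE_PER_HD_DISC = 30            # dollars; the "even at $30/disc" upper bound
TOTAL_DVD_SALES = 1_500_000_000   # dollars; "around $1.5 billion"

hd_titles = BLU_RAY_TITLES + HD_DVD_TITLES
hd_revenue = hd_titles * PRICE_PER_HD_DISC
blu_ray_share = BLU_RAY_TITLES / hd_titles
hd_fraction_of_dvd = hd_revenue / TOTAL_DVD_SALES

print(f"HD titles sold: {hd_titles:,}")
print(f"BluRay share of HD titles: {blu_ray_share:.1%}")
print(f"HD revenue at $30/disc: ${hd_revenue:,}")
print(f"HD revenue relative to DVD sales: {hd_fraction_of_dvd:.2%}")
```

Under those assumptions the two HD formats together come to about 1.17 million discs and roughly $35 million, a little over 2% of the DVD figure, with BluRay taking about 71% of the HD titles.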

Thursday, April 26, 2007

Teaching Your Team To Program

The world of testing is becoming a lot more technical than it once was. While there is still a need for strong exploratory testing, the need for test automation is increasing dramatically. Test automation requires programming at some level. Good test automation requires skilled programmers. Thus the need for test developers is increasing. As I've said before, learning to program is hard. As a manager of a team of non-programmers (or semi-literate programmers), is there a way to train them to become what you need? Is it better to just go hire programmers and teach them to test? As I've said elsewhere, I think it is easier to teach a programmer to test than to teach a tester to program. This post, however, is about doing just that: developing programmers out of an intact test team. Here is my advice on how to do that.

First, don't expect everyone to make the transition. For some it will be impossible. For others, there will be little interest. Remember, you still need manual testing. You don't want your whole team to convert. Look for the ones that show interest and aptitude. To find the ones who really show interest, you have to make it clear you don't expect everyone to move. You have to make it clear that this isn't a promotion. If getting promoted (or keeping your job) requires learning programming, everyone will want to.

Provide opportunity to learn. Expecting people to know how to program before you give them the opportunity to do so will guarantee failure. If you only give programming assignments to those who already know how, you won't develop any new programmers. Learning programming on your own time is something only a few will be able to accomplish. It is hard. Instead, find some tools you need written or features you need added to existing tools and offer the chance to work on them. See who grabs the opportunity and push them.

Give them time to learn. Beginning programmers won't be productive. It will take them 10x as long to do something as an experienced coder. That's okay. You have to budget for that time. The payoff will come later. It is important that management guard that time also. I've seen many times when management hands out an opportunity but then quickly piles up the non-programming work so there is no time to learn. If you ask someone to write a tool, make sure they have long, uninterrupted blocks of time to focus on it. As an upper-level manager, hold your leads accountable for providing this time.

Provide mentors. Books and blog posts only cover so much. There are a lot of little details and corner cases they don't discuss. New programmers need someone they can approach to ask questions. Getting stuck on a problem for 2-3 hours is healthy. Being stuck for 2-3 days is discouraging. If you have test-developers on your team, utilize them. If you don't, talk to the dev manager about using his people in this role.

Make it an assignment. Hold the aspiring programmers accountable for their output. Learning to program shouldn't be extra credit. When things get busy, extra credit falls by the wayside. Instead, make it part of their commitments for the period. Discuss the assignment during your 1:1s. If they don't get it done (whatever done means) by review time, hold it against them. It has to be treated like any other assignment. If they didn't get their test pass done, you would care. They need to know you'll care if they don't learn how to program.

Give real assignments. Don't ask them to write something you're going to throw away. Have them working on production code. Pick a simple tool. Better yet, have them augment an existing one. The assignments should also be in the same language as the application. Teaching them to program Visual Basic in Visual Test isn't training them to understand your product. Their utility will be limited if they can't interact with the developers. Finding mentors will also be harder.

Finally, give them an escape hatch. Programming isn't for everyone. If someone determines it isn't for them, let them go back to exploratory testing or whatever they were doing before. Forcing someone to learn something they don't want to is doomed.

Wednesday, April 25, 2007

Agile Testing Netcast

I tried a new netcast today. It's the initial netcast from Jason at Parlezuml. I believe the weekly show will be about agile development in general but this first one is all about the concept of agile testing. There are several interviews. The topics cover agile vs. waterfall testing, exploratory testing, and others. Some of the interviews were interesting. Others less so. It's a good freshman effort though. I'll be back next week.

Teaching Yourself To Program

Following up on my last post, I'll give some instruction for those testers who want to become programmers. As I said last time, don't expect to learn everything you need to at work. The forces in play will almost certainly conspire to keep you from being able to. If you desire to expand your horizons and become a programmer (either staying in test or moving to dev), here are some ideas that can aid you in your journey.

Look for opportunities to program. Take advantage of them when they appear. There's always an extra tool that needs to be written. There's always a feature that would be really nice to have added to an existing tool. If you want to get support for your learning, this is your best chance. If you can do it and show the benefit to the team, you'll often be rewarded by more opportunities. The only way to get good at programming is to program. Practice every chance you get. I've seen plenty of aspiring developers ignore chances to develop tools at work. The ones that do rarely--if ever--make the leap.

Start small. When you are starting out, it is usually easier to add functionality to an existing program than to write something from scratch. This is because you have a structure to work with. When you are learning to program, you don't know how to put all the pieces together. You'll learn that, but starting out it can be hard. Working on something similar to an already-existing feature can be a good place to start.

Learn the language of your application. While Ruby, Python, or Visual Basic may be appealing and even easier to learn, unless they are the language the application you are testing is written in, you'd be best not to spend your time learning them--yet. Eventually, yes, but they won't really help you get where you want to go. There are two reasons for this. First, the most effective test automation is done in the native language of the application. Anything else doesn't align quite right and will make reaching the corners hard. Second, the people you can learn the most from--the application's developers--can only help you if you speak their language.

Ask a lot of questions. Potential candidates often ask what working at Microsoft is like. I invariably tell them that it is a very supportive environment. When you ask a question, you always get a detailed answer and usually a history lesson about why it is that way. I suspect that we're not alone in that respect. Most developers enjoy what they are doing and want to share that joy. If you ask them a question about how their program works or even about some project you are doing (that test tool you are writing) and the problem you are stuck on, they'll often be happy to help. Be careful not to take too much of a person's time and be especially careful not to ask the same question twice, but don't be afraid to ask. Part of learning on your own is asking questions of those more skilled than you. My rule of thumb is to spend a few hours trying to solve any problem. If you still don't have an answer, go ask someone.

Be prepared to put in the work. The job you were doing before you decided to go down this route didn't go away. You'll still need to do it. Learning to program isn't simple. It will take time. The combination is going to take a lot of time. Be prepared to show up early or stay late to work on your project.

Don't quit. There will be times when you get too busy to learn. There will be bugs that stump you for a long time. Don't lose track of what you are doing. Don't get out of the habit. Don't take on projects and then let them languish for months. You have to keep going.

Put in time outside of work. Read. Either online or on dead trees. A lot. Practice. A lot. Pick up a side project and work on that. Open source makes a good place to start. Just working on something of your own does too. Work on programming a game or writing a tool. It's not even important that you finish so long as you are working. If you want to become a developer, it should be because you enjoy programming. Use this joy to your advantage. What you learn on the side will help make the stuff you do at work make more sense and go faster.

Monday, April 23, 2007

Fred Fish Dies

Those of you who owned an Amiga were probably aware of the Fred Fish disk collection. For those that weren't, this was a huge collection of shareware and freeware software for the Amiga. It was organized into a series of disks which became known as Fred Fish disks. It became the definitive way to reference shareware in that community. If you wanted, say, Directory Opus, you could find it on disk #412. Megaball (best Breakout/Arkanoid clone out there) on disk 477. Anyway, the guy who created the collection was a programmer named Fred Fish who lived in Idaho. He died on April 20th. He did a lot for the world of computing back then. He'll be missed.

The contents of all the disks can be found here. There appear to have been 1,120 disks in all.

Friday, April 20, 2007

It's Difficult to Grow a Test Developer

A lot of testers begin life as software test engineers. That is, they execute tests but don't do any (or much) programming. The dream of many testers is to become a test developer or a developer. Reciprocally, the dream of many test managers is to grow their testers into test developers. Is this a realistic dream? It can be, but probably isn't in most cases.

It's very hard to become a self-taught developer. When we look out at the computer landscape we see plenty of self-taught programmers so it looks easy. However, for each one that succeeds, many more fail. Why is that? Two reasons I suspect. First is that some people are just not cut out to be programmers. Second, and perhaps more important, is that it is really hard. Becoming a good programmer* requires a lot of knowledge. That means a lot of reading (online or books) and a lot of practice. It turns out that it is a lot easier to desire to be a programmer than to put in the work to become one.

I covered the first point in my post entitled You Can't Teach Height. Studies show that a good number of people, even those interested in programming, cannot grok it. My suspicion is that this has to do with the abstract nature of programming. This isn't to say that they can't program at all but they can't program well and as the difficulty goes up, more and more drop off.

The second reason is the one that gets a lot of people. I've seen many try to make the leap and only a few succeed. Those that did had to put in a lot of work on their own time. Those that didn't often weren't willing to put in time outside of work. Anyone desiring to go from tester to test dev with just the time they spend on the job is probably going to be disappointed. It takes a whole lot of effort to become a competent programmer. I laid out my recommendations in one of my earliest posts. I call for learning not just the syntax but also the essentials of computer science. You can program without these but if you don't pick them up you'll never be great. Learning them, however, takes a lot of time and effort.

Most of the time employers won't give you that time. They want you to be productive and anyone learning to program is not productive. The simplest things take a long time. There is almost always a more competent programmer on the team somewhere and if work needs to be done, it will be given to him. It's not that most managers discourage learning to program. They'd like it to happen. They just won't often budget enough of your time to actually do it.

Now that I've said how hard it is, are there things that testers can do to increase their odds? What about test managers? I'll cover the issue from both perspectives in future posts.

* It is important to note what I'm talking about here. It's not too hard for someone to teach themselves enough C# to write an ASP.NET page or enough Perl to parse some log files. That, however, is a far cry from being able to write a test harness, analyze performance, or automate the testing of a COM object.

Wednesday, April 18, 2007

How Much Better Is High Definition Video?

Moving to HD is expensive. You have to buy a new TV, new DVD player, and possibly a new receiver. If you've ever wondered if HD-DVD (or BluRay) is worth it, this demo I found may be interesting. These are a series of pictures taken from the DVD and the HD-DVD or BluRay of the same movie. More than just a series of pictures, they are flash animations. You can drag a bar back and forth across the screen and see the differences. Give them a try. The differences are stark. Here are the two I recommend:

Casino Royale - look at the horse and the bikini.
Serenity - look at the lights on the ships.

Link via the HTGuys.

Tuesday, April 17, 2007

Do You Have Golden Ears?

Here is an interesting demo I heard about on This Week in Media. It contains 2 audio clips in both 128 kbps and 256 kbps mp3. One clip is Mozart and the other from R.E.M. The Mozart clip has a lot of strings in it. The R.E.M. clip has background vocals. Both are where you might find the differences represented. To be honest, I can't hear a clear difference. The strings feel a little more mushy but that could be psychoacoustic too. Certainly it's not the sort of thing I would notice without an A-B comparison. Give the demo a listen. Can you hear the differences?

Windows Media for Firefox Users

Microsoft just released a Windows Media Player plugin for Firefox. This will allow people who use Firefox to view web videos being broadcast in WMV format. I'm not actually a Firefox guy but I know many that are. This is a cool development.

Monday, April 16, 2007

The Purpose of Coding Standards

I'm involved with a group at work crafting the coding standard for test developers across much of Windows. As part of this exercise, it became necessary to discuss what the purpose of a coding standard is. Here are my thoughts.

There are two major camps when it comes to coding standards: those who want the standard to bring uniformity to code and those who want to ban dangerous practices and idioms. Many who drink from the eXtreme Programming trough fall into the first camp. I fall into the second. To see why this distinction matters, consider the eternal question of coding standards: Which brace style is better? K&R or Allman?

Those who want uniformity are forced to choose a style. It usually doesn't matter which one, but there has to be a uniform implementation in the code. The argument goes like this, "Everyone codes to the same conventions, so that no matter what code you read (and even if you do practice ownership you will be reading code), it looks familiar. We need every bit of help we can get in understanding the system. Making it all look alike is a key aspect of that." It seems to be asserted more often than proven that uniformity makes sharing code easier. Because this is asserted without real proof, the extent of its truthfulness is easily made all-encompassing.

There is some truth here. If the vocabulary of the code is different from one person to the next, uptake will be slow. By vocabulary, I mean the practices and idioms. Perhaps the naming conventions of the methods and variables matter. Certainly the extent and type of inheritance make a difference. Use of libraries like boost, STL, ATL, MFC, etc. can be problematic if not everyone is familiar with them.

However, is this level of standardization really necessary? The XP people (at least some of them like Ron Jeffries) seem to say that even small differences like bracing styles make it difficult to work with shared code. Unfortunately, most of the literature advocating such a stance is short of justification. Others seem to advocate very terse standards. At some level differences in style do matter. At many other levels, they don't. Where is the line? We hire smart people. Let them think. Prescribing everything that could be done in a programming language is not only very tedious but also gives no room for people to use the tools in the best way given the situation.

Those that don't demand uniformity are allowed to give a different answer. If there is a big advantage to one style over another, they can advocate it. If, however, there is not--and in this case there isn't--they can avoid that battle altogether and allow different authors to choose whatever style they are comfortable with.

In my opinion, a coding standard should serve two primary purposes. It should preclude those practices that are likely to cause bugs and prescribe those that avoid them. It should also preclude practices that make maintenance difficult and prescribe those that make it easier. Thus coding standards are right to talk about comments or initialization order. They should shy away from topics which are more matters of opinion. The correct bracing style and the use of Hungarian are classic topics that I don't think belong in a coding standard. The best way to handle those is to require that people maintain whatever style the document was created with. Don't intersperse your K&R bracing style with someone else's Allman. And whatever you do, don't go changing already existing code to remove/add Hungarian notation or bracing, etc.
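As an illustration of the kind of rule that earns its place, here is a small sketch. The standard discussed here is for C/C++ test code; this is a Python analogue chosen only because the idiom is compact, and the function names are hypothetical. A mutable default argument is a classic practice that quietly causes bugs, so a standard can reasonably ban it while staying silent on how the code is laid out.

```python
def append_result_risky(result, results=[]):
    # Bug-prone: the default list is created once, when the function is
    # defined, and is then shared by every call that omits `results`.
    results.append(result)
    return results


def append_result_safe(result, results=None):
    # Safer: build a fresh list for each call that does not supply one.
    if results is None:
        results = []
    results.append(result)
    return results
```

Calling append_result_risky("a") and then append_result_risky("b") returns a list holding both values, because the second call reuses the list left over from the first. The safe version returns a one-element list each time. A rule against the first form is easy to defend because you can point at the bug it prevents; a rule about where the braces go has no such argument behind it.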

A coding standard which tries to stick to only those items that affect maintenance or cause bugs is much less contentious. It's easy to defend and doesn't tend to step on toes. When it does, you have good reasons to tell the violator why his code is likely to cause bugs or will be hard to maintain.

What are your thoughts? A lot of people are in favor of very restrictive coding standards. If you're one, let me know what you perceive the benefits to be.

Saturday, April 14, 2007

Why Windows Can't See 4GB of Memory

Interesting post by Hilton Locke about why installing 4 GB of RAM on a 32-bit box doesn't actually give you 4 GB of RAM. Instead it gives you something like 3 GB of RAM. A 32-bit operating system should be able to access 4GB. Worse, a 64-bit operating system might not be any better. What's happening? The short answer is that a lot of space is taken up by memory-mapped IO. The RAM can't be seen because that space is used for IO devices. I hadn't considered that. For the long answer, read the whole post. There's a lot of interesting info in there.
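The short answer can be worked through as simple arithmetic. The Python sketch below is mine, not from the linked post, and the 1 GiB reserved for memory-mapped IO is a made-up round number; the real amount depends on the video card, chipset, and other devices in the machine.

```python
GIB = 2**30  # bytes in one gibibyte

ADDRESS_BITS = 32
addressable = 2**ADDRESS_BITS   # distinct byte addresses a 32-bit address can name
installed_ram = 4 * GIB

# Hypothetical reservation: assume devices claim 1 GiB of the address
# range for memory-mapped IO. Actual systems vary.
mmio_reserved = 1 * GIB

# RAM and device windows compete for the same addresses, so any RAM that
# would sit behind a device window has no address left to be reached at.
visible_ram = min(installed_ram, addressable - mmio_reserved)

print(f"Addressable space: {addressable // GIB} GiB")
print(f"Visible RAM:       {visible_ram // GIB} GiB")
```

With those assumed numbers the machine has 4 GiB of addresses, 1 GiB of which point at devices, leaving 3 GiB for RAM, which lines up with the "something like 3 GB" figure.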

My New Addiction: Puzzle Quest

I don't do a lot of gaming. Once upon a time I did, but now that I'm married and have kids, my gaming time is pretty restricted. Last week I was carpooling with a friend and he handed me his Nintendo DS and said "Try this." It was running a new game called Puzzle Quest. I played for the whole drive and promptly ordered it from Amazon. You see, the game is popular enough that it's completely sold out locally. Since it arrived I've been playing it nonstop.

Puzzle Quest is an interesting game. It's a mix of an RPG with Bejeweled. That sounds strange, but it's actually quite addicting. You have the interesting puzzle game play of a casual game mixed with the advancements of an RPG. There is a story but it's just an excuse for your character to get into fights, earn experience, and level up.

The game makes liberal use of the Bejeweled mechanic but subtly changes it from task to task. When you are fighting a monster, making matches either gains you mana or inflicts damage on your enemy. When trying to capture a creature, you have to clear the board. When learning a new spell you need to match specified numbers of each color jewel before running out of matches. To add to the mix, during a fight you have spells available to you. Rather than making a move (swapping two jewels), you can cast a spell. Spells may damage an opponent, heal you, or affect the board. Carefully using combinations can be quite effective. The more you play, the better your character becomes. You learn new spells and thus the way to beat an opponent keeps changing.

So far the game is maintaining its charm. The advancements and the new abilities keep me coming back for more.

Friday, April 13, 2007

Getting Real In Software Development

At the recommendation of one of our designers, I just finished reading the book Getting Real by the people at 37 Signals. This is the company that created Ruby on Rails and several web tools for managing business. This book distills their philosophy of software development down into a series of 91 short essays. The philosophy tracks very well with the concepts espoused by the Agile movement. I found most of what they had to say thought provoking and much of it useful. As with any book on this subject, you have to be careful to tailor the advice to your particular circumstances.

The basic tenets of the book are to think and work simply. Products should be focused without a lot of disparate features. They should do one thing and do it well. They should ship early. The development process should be streamlined without excessive documentation. To facilitate this, the teams should be small.

The book forced me to re-examine the way I think about features. Is there a more simple, first approximation that we can make? It's too easy to shoot for the moon up front. That can be daunting though and perhaps deter even starting. Think about the minimal amount of work needed to get the job done. Flesh things out later.

Here are a few of the points I found most interesting in the book:

Make Opinionated Software - Most of the time users don't need all of the switches and options we allow. Make a logical decision which satisfies most people. That's better than overwhelming people with a myriad of subtly different choices. I think of a certain API I worked on years ago that had something like 4 different synchronization models. Sure, there was always the perfect one available, but you had to think about it. Two (sync, async) would have been better.
Don't Do Dead Documents - If the document won't substantially affect the process, don't make it. Don't write up a spec that gets thrown away as soon as coding starts.
Less Software - If you can solve 80% of the problem with 20% of the code, consider if that is enough. It's easier to maintain the 20% than 5 times that amount.
Get Well Rounded Individuals - Don't hire for specific skills. Hire for smarts. Smart people learn and you'll probably be doing something different in a few years anyway.
Ride Out the Storm - Don't let a short-lived firestorm change the way you do things. Wait for things to calm down before making decisions. This applies well beyond software.

Much of what this book suggests applies a lot better to a small team doing web software than to a larger team doing desktop software. The focus on simplicity only goes so far. There is a time and a place for complexity. Word is more complex than WordPad but it also solves a lot more problems. Similarly, Vim is a lot more complex than Notepad but it's a lot better for the task of coding. Getting Real doesn't deal with the need for that complexity. I tend to think 37 Signals would stop at WordPad. They do have their own word processor and it's intentionally simple (although not quite WordPad simple). Still, there's a lot of insight in their philosophy. Consider it carefully before dismissing it. I recommend the book. It's a quick read and I'm sure you'll learn something. I did.

As far as I can tell the book is available only from 37 Signals itself. It isn't on Amazon. You can buy it from the web site or view the free HTML version online. I like killing trees so I read the paper version.

Thursday, April 12, 2007

April Netcast Update

About time for another update of my netcast list. It's changed a little and some have moved locations. It's still fundamentally the same list though.

This Week in Tech - Leo Laporte hosts a roundtable discussion of the news of the week.

This Week in Media - Technical and political discussion of media production. Note the new address.

Windows Weekly - This month saw an interview with Dave Caulton of the Zune team.

The HDTV Podcast - The latest in HDTV news and equipment reviews. Only 1/2 hour.

The Dice Tower - Weekly discussion of boardgames.

Major Nelson - Larry Hryb of the XBox Live team talks about XBox 360 gaming. News and interviews.

Top 10 Most Influential Amiga Games

I have a soft spot in my heart for the Commodore Amiga. It came out 22 years ago in 1985 and was way ahead of its time. It had a GUI, stereo audio, hi-res color graphics, hardware-accelerated graphics, etc. It had all this years before the PC. I didn't jump to the PC until 1995 because it was only then that it had managed to surpass the Amiga. It was the Amiga that first brought things like 3D rendering, digital video editing, and looped audio creation to the mainstream.

Given the great multimedia capabilities, the Amiga was also a great gaming computer. Wired just published an article listing the top 10 most influential games on the Amiga. I played most of these and enjoyed them. Games listed include:

Defender of the Crown - Amazing graphics for its time. Spent a lot of time playing this one.

Sensible Soccer

Speedball 2

Syndicate - actually the first SVGA game on the PC I was aware of

Lemmings - really cool platformer. Now available on the PSP.

Pinball Dreams

Cannon Fodder

Shadow of the Beast - First arcade-quality side scroller on a PC.

Another World - One of the first vector-based adventure games.

Worms - didn't know this was an Amiga title.

To this list I would add:

Blood Money - first arcade-quality side-scrolling shooter on a PC.

F/A-18 Interceptor - first 3D, color flight simulator on a PC. This game sold me on the Amiga.

Battle Chess - Animated Chess. The rook eating the queen was one of my favorite moves.

Dungeon Master - Established the 3D dungeon genre in something other than wireframe graphics. Ultima Underworld definitely followed this trend and arguably, titles like Wolfenstein did too.

4/13 Update - Added Dungeon Master to the list. Definitely influential.

Monday, April 9, 2007

Death of the Floppy Drive

The floppy disk has been dead for a while now. It's just too small and too slow. I haven't really used them for anything other than a boot disk for years. Even network drivers don't fit in 1.44 megs anymore. USB Flash drives serve the purpose of floppies now. However, I didn't realize just how dead they were until yesterday.

While rummaging through an old laptop bag of mine, I ran across a disk from college which contained my senior honors thesis. I have a web-copy of the document but had managed to lose the original Word format version. It was when I went to go retrieve the document from the disk that I realized just how dead floppy disks are. Out of the 6 computers in my house (4 desktops, 2 laptops), only my kids' computer still even has a floppy drive in it. While I have drives sitting around, I don't bother to hook them up any more when I build a new computer.

I don't know how typical my experience is but most new machines don't come with floppy drives any more. The venerable 3.5" floppy disk is so antiquated that it is becoming difficult to even find a drive to read it in. What has been one of the most stable parts of the PC since near its inception is now nearly extinct. How much data is sitting around on disks in drawers, filing cabinets, etc. that we'll be unable to access very shortly?

Saturday, April 7, 2007

Breaking Down the Test/Dev Barrier

Similar to the solution I consider in my post on single-function roles, the Braidy Tester has a provocative new post entitled "Let's Go Bust Some Silos," wherein he asks what happens when you get rid of the test/dev silos and have everyone work together under one roof. Instead of having one group that writes dev code and doesn't test it much and another team that writes tests but doesn't understand the code much, have a single team which writes and tests fluidly.

One of his commenters, Jim Bullock, raises some good concerns about such an arrangement. Is it really possible for someone focused on the creation to be objective enough to test? Do we need someone tasked with breaking things to get over the tendency to see one's own success? He also suggests that having test/dev silos helps different sorts of personalities get along. There might be some merit to this. On a team where test and dev are integrated, a non-coding tester is likely to be treated as inferior. On a test-only team, their value is probably more highly regarded. This need not be so, but too often it is.

In considering this question, it is interesting to consider why we have separate testing teams to begin with. How did they come about? At one point all you had were developers. They tested their own stuff. Later, independent testers came into the picture and organized on independent teams. Why was that? If anyone was there for this formation or knows of books which speak of it, let me know.

I suspect that this happened because developers were found to be too busy to give sufficient time over to testing. The desire for more features made it hard to take the time to test. In a day without unit tests, the time it took to test was very large. Stopping to do so might mean missing a deadline. Based on that, it makes sense to bring in someone whose whole time is taken up by testing. There is no pressure for that person to add features at the expense of testing. Today, however, when we have automated tests and unit tests which can give us a lot of testing without a lot of time, do we still need this separate role?

Friday, April 6, 2007

Can Code Be Truly Self-Documenting?

I know I'm stepping into the middle of a holy war here but I've been in some conversations on this subject lately and thought it might be worth laying out my thoughts. Recently a coworker in another part of the company told me that they are not allowed to put comments in their code. The argument is that code should be self-documenting and any comments in the code will become incorrect over time. Is that really true?

The idea stems from the Extreme Programming (XP) movement although it could be a misinterpretation of their views. Certainly Vasilli Bykov argues that it is. They argue against large amounts of documentation for sure. To an extent, they are right. At some point the returns on documentation diminish. A lot of work can go into creating detailed documentation which quickly gets out of date. People update the code without updating the comments or the specs and suddenly they are worse than worthless. Is that then a condemnation of all comments? No.

First off, let's take on the idea that code is self-documenting. It is not. Not fully anyway. Code is ultimately designed to tell a computer how to accomplish a task. It's not foremost intended to tell a human how to accomplish a task. For that we have prose. While code is often readable by a human and should be made as easily intelligible by a human as possible, that is still not its primary task. Anyone who has tried to take over an undocumented code base--even one with clean code--will tell you it is difficult. You have to read the whole thing at least twice. Once to build up your token list and the second time to decipher how the tokens interact. Not everything is obvious without tracing out the actual execution in your mind (or a debugger). This is a time consuming process. And don't even get me started on "modern" programming fancies like keeping each function to a screen and deep callstacks. Those have terrible implications for readability.

Comments have a lot of utility in programming. They tell the next user what to expect. There are two sorts of comments that are most useful. High level comments help the next guy understand how everything works together. Low level comments can help explain a complex piece of code or justify a particular decision.

High level comments might be class level or even file-level. They explain what the intent of this part of the program is, some information about how the various classes and functions interact, etc. When first tackling a new codebase, these kinds of comments can be invaluable. Trying to build up an understanding of the architecture of code by reading each function is like trying to see the pattern in a mosaic using a magnifying glass. It's really hard. It's better to step back and take in the whole picture at once.

I once had to make some modifications to PostgreSQL for a class. To do this I had to understand how the various pieces worked together to make sure I modified all the right parts. There were no specs available but most of the files had a header block which explained what the functions and data structures in the file did. This proved invaluable. Instead of having to go understand each structure, then see how the functions used them, then finally understand how the myriad functions interacted with each other, I could read this synopsis and focus on just the parts that mattered to me. I also quickly got a sense of whether I was making modifications in line with the original intent or not. Without any comments, I would have spent much longer trying to accomplish the same task.

Low level comments are usually interspersed within a function or method. They should serve two primary purposes. First, they should explain any complex actions going on. Math is notoriously hard to understand from code. This becomes even more true if it is optimized. A comment telling you that a lookup table is being used to implement clipping is a lot easier to understand than looking at the table and surmising its purpose from the values in it. The other really valuable purpose is to explain any deviation from standard practice. Sometimes the obvious solution is incorrect. There are bugs caused by side effects or subtle corner cases which go unnoticed. If you were to write the code to fix the bug in those cases without comments, the next guy is likely to undo your fix and recreate the bug.
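
To make those two kinds of low level comment concrete, here is a rough sketch in Python. Everything in it (the names CLIP_TABLE, clip_sample, and total, and the 0..255 range) is made up for illustration and not taken from any real codebase. The first comment explains what an otherwise cryptic table is for; the second warns the next guy off the "obvious" simplification.

```python
import math

# Lookup table for clipping: CLIP_TABLE[i] holds (i - OFFSET) clamped to
# 0..255. Indexing with (value + OFFSET) replaces two comparisons per sample.
# Only valid for inputs in -256..511; callers must stay inside that range.
OFFSET = 256
CLIP_TABLE = [min(255, max(0, i - OFFSET)) for i in range(768)]


def clip_sample(value: int) -> int:
    """Clamp value to 0..255 using the precomputed table."""
    return CLIP_TABLE[value + OFFSET]


def total(readings: list[float]) -> float:
    """Add up a list of floating-point readings."""
    # NOTE: math.fsum is deliberate. The built-in sum() accumulates rounding
    # error (ten readings of 0.1 do not add up to exactly 1.0), which broke
    # an equality check downstream. Do not "simplify" this back to sum().
    return math.fsum(readings)
```

Strip the comments and the code still runs, but the table becomes a mystery and the call to math.fsum looks like an oversight waiting to be "cleaned up."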

Other sorts of comments can be useful too. A header block on a function or method describing the purpose and what each parameter does is a lot faster to read and interpret than trying to divine the same information by reading the whole function. It's not that it cannot be done. It can. It just takes a long time.
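
As a small sketch of such a header block, again in Python and with a hypothetical function (scale_to_byte is invented for this example), consider:

```python
def scale_to_byte(value: float, low: float, high: float) -> int:
    """Map value from the range [low, high] onto an integer in 0..255.

    Parameters:
        value: the reading to convert; values outside [low, high] are clamped.
        low:   the reading that maps to 0.
        high:  the reading that maps to 255; must be greater than low.

    Returns:
        An int between 0 and 255 inclusive.

    Raises:
        ValueError: if high is not greater than low.
    """
    if high <= low:
        raise ValueError("high must be greater than low")
    clamped = min(high, max(low, value))
    return round((clamped - low) / (high - low) * 255)
```

A caller can learn the clamping behavior and the error case from the header alone, without tracing through the body.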

So code can't be fully self-documenting. But if the comments are out of date, isn't that worse than no comments at all? An argument can be made that it is. It takes a while to notice that the comments are wrong and then you have to go back to the code anyway. In that case, you might as well not have had any comments. This argument, however, is based upon the premise that the comments will inevitably become incorrect. I dispute that. The answer is simple: update the comments when you change the code. When code is refactored, the comments must be refactored as well. Not doing so is just as bad as not running the unit tests or not checking return values.

Of course the response is that this never happens. Does that have to be the case? Changing code without changing the comments is introducing a bug. Not a computer-level bug but a programmer-level one. This is poor programming. Don't do it. Code reviewers should be vigilant for this sort of thing and flag any wrong comments as errors. Once upon a time no one tested the code they wrote. No one had it reviewed. No one wrote unit tests. Most of these are part of a standard best-practices regimen today. Can't updating comments just be added to the list of best practices? I see no reason it cannot.

Comments are, in my mind, an indispensable part of healthy software. Good programming is not just about communicating with the computer but also with the next programmer. At some point you'll move on and someone else will have to work in your codebase. At that point comments become important not only for reducing ramp-up time but critical to avoid making the same mistakes twice.

Redfin: Real Estate Revolution?

Ever since I sold my first house, I've thought that the real estate business was in need of a shakeup. The typical cost of buying or selling a house is 6% of the sale price. The buyer's agent gets 3% and so does the seller's agent. On a $100,000 house that's not a big deal. However, houses in the Seattle area average $430,000. That means the commission is roughly $26,000. That's a pretty big sum of money. The buyer rarely notices this cost because it is hidden from them. The seller, however, notices that they are only getting about $404,000 from their $430,000 sale. The transaction costs are really high. That is definitely a scenario ripe for disruption.

A Seattle-area real estate company called Redfin is trying to be this disrupter. Redfin realizes that most homes are found on the web. In those circumstances, agents don't deliver nearly the value they once did. Because of that, they offer a no-frills real estate service at a fraction of the cost. Redfin lets you do most of the legwork and in return refunds most of the commission to you. That could save the seller or buyer $10,000 on the typical Seattle-area transaction. A friend of mine recently used them and found them to be quite efficient. Most people are already familiar with the area they are looking at and don't need an agent for anything other than showing them the inside of the house and helping them with the paperwork. If you are comfortable with the process, Redfin could be very useful.

Scoble posted an interview with the founder. He (Scoble) spends a lot of time telling us about his recent move from Seattle to Silicon Valley but there's a lot of interesting information in there too. There's also a good article in the Seattle Times discussing the business model.

I'm very interested to see where all this goes. I suspect that in the next decade (real estate moves slowly), the way we buy and sell homes will be radically different and a lot more streamlined. Markets don't like inefficiencies. The web likes them even less.

Yet Another Way To Connect Your PC and Your Screen

First it was VGA, then DVI, then HDMI, now we add DisplayPort. Engadget is reporting that DisplayPort was just ratified by VESA. DisplayPort is apparently compatible with DVI which is also compatible with HDMI. Why do we need yet another connector? I'm not really clear on that. It is apparently not encumbered by IP like HDMI is. It may be very good technology but I'm not sure that justifies the confusion this is sure to cause in the market if it takes off.

Monday, April 2, 2007

A Crack In the DRM Wall?

...More like a gaping hole.

To date there has been a unified face put on major digital media assets. Whether TV, movies, or music, Hollywood et al. have not released them without DRM. Sure, there have been some independent labels and even artists who sold their music without DRM, but none of the big boys have done it. The stated reason was that piracy would be too high. Today we see a tectonic shift taking place. EMI--one of the big four RIAA members--just announced that it will be selling non-DRM'd music. Not only that, but the quality of the DRM-free music will be superior to the music it sells in DRM'd form. The cost: about 30% higher. The report says that this will include "all of its digital repertoire." Amazing. That's a pretty big change in position. Perhaps the recent drop in CD sales has them scared.

I've long held that the labels could sell a lot of music if they just made it cheap and easy to get ahold of. Why risk the lawsuits, viruses, and slow links of KaZaa and company if for $1.29 you can buy the song without DRM from a reputable source? Purchased digital music has never excited me because of the DRM. I've never bought any digital music because I'm unconvinced that the mechanisms to play it will last very long. I always buy on CD and rip. Now that the music is unencumbered, I may have to rethink that policy. It will be interesting to watch how sales go now that this change has taken place. How many others like me are out there?

The other aspect I'll be intently watching is how the other labels react. Do they jump into the water too? It seems like they would have to. If I am torn between an EMI and a Sony artist, I'll buy EMI because it's higher quality and unencumbered.