Thursday, December 25, 2008
Wednesday, December 24, 2008
Now that I've completed the final class toward my master's degree, I have the time to explore some things of my own choosing. One thing I intend to do is learn a new programming language. This article, which I discovered via Reddit, is a good place to start. It lists 10 languages worth learning. These are the up-and-coming languages, not the current hot topics like Python or Ruby. Interesting items on the list include Squeak, Haskell, Clojure, and PLT Scheme.
Tuesday, December 23, 2008
Echoing the thinking of Joel Spolsky (and me), Bjarne Stroustrup (the inventor of C++) says the way we teach CS today is broken. That is, it is at odds with the needs of the industry. Having just completed a master's in CS, I can say firsthand that this is true. Some of what I learned was applicable to my work at Microsoft, but I could see the tension, especially in the higher-level classes, between preparing someone for research and preparing them for industry. Research is not about programming. It is about thinking in new and innovative ways about problems. Industry needs good programmers and the best solutions to problems. CS research students often ignore the best solutions to problems because those have already been discovered, and you can't get published writing about the same thing again. Industry prefers the proven solutions.
Stroustrup bemoans the use of Java in so many programs today. He doesn't attack Java per se but rather the emphasis on the libraries. Students don't learn to solve fundamental problems (like sorting) because the library can do that for them. It is easy to become an expert in Java rather than an expert in programming, something Robert Dewar states well in this interview. When the Java (or C# or Ruby) language wanes in popularity, it is the underlying CS skills that translate, not the intimate knowledge of the Java collections hierarchy. I have a more fundamental problem with higher-level languages for learning. It is important to learn how the machine works because someday everyone will be stuck with a leaky abstraction and will have to solve the deeper problem. Without an understanding of what is below the byte code (or code), doing so can become very difficult. Without that understanding of the underlying system, it is easy to code oneself into a corner without knowing it.
He also talks about the lack of quality in the code written by students. Too many projects are littered with magic constants, are poorly factored, etc. This probably comes from the fact that code written for classes is thrown away. Most of my professors never even asked for our code, let alone actually read it. A TA in my OS class actually said they didn't care about memory leaks. In a freaking OS class of all things!!!
- Learn a lower-level language. Joel Spolsky says everyone should learn C. I'm inclined to agree. It's not too hard to learn and it is as close to the metal as you usually want to get.
- Learn to express your ideas in more than one language. Each is a little different. Learning other languages gives you tools to think about problems in new ways. This also ensures your knowledge is about the algorithms, not the base class libraries.
Sunday, December 21, 2008
Tuesday, December 9, 2008
Friday, December 5, 2008
In my opinion, estimating how long it will take to write a piece of software is difficult if you haven’t done it before, and with software we never have. The more experience you have, the more you’ll have done similar things and thus the more accurate your estimates will be. To help build this experience faster, it is important to track how long things actually take.
Many software developers just start working and don’t really plan. They have a vague idea of what they want to accomplish, but haven’t really thought it through. This inevitably leads to the task taking longer than expected. There is a great discussion of this on the Stack Overflow Podcast. Check out episodes 5 and 16 for the before/after. There are transcripts available if you don’t have time to listen. Things take longer because, without careful planning, a lot of pieces of work aren’t taken into consideration. Basically, the programmer makes an unfounded assumption that he doesn’t have to do some work that he does in fact have to do. In my experience, this is one of the largest contributing factors to things taking too long.
The good news is that most of these unconsidered items can be teased out if sufficient thought is given to the problem. They are known unknowns. The questions are usually known, just not the answers. There are techniques for doing this.
The simplest is breaking down the problem into small pieces. On our projects we never let items get bigger than 5 days of work. Even that limit is probably too large. If you are estimating in chunks larger than a day or two, you are probably glossing over important details.
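A check like this is easy to automate. The sketch below is only an illustration: the `WorkItem` type, the sample items, and the 2-day threshold (taken from the "day or two" guideline) are my own assumptions, not something from a real tracking tool.

```python
from dataclasses import dataclass

# Assumed threshold, based on the "day or two" rule of thumb.
MAX_ITEM_DAYS = 2.0


@dataclass
class WorkItem:
    """A hypothetical work item with an estimate in days."""
    name: str
    estimate_days: float


def needs_breakdown(items, max_days=MAX_ITEM_DAYS):
    """Return the items whose estimates exceed max_days."""
    return [item for item in items if item.estimate_days > max_days]


# Made-up sample data for illustration only.
items = [
    WorkItem("Break class Foo into two classes", 5.0),
    WorkItem("Write unit tests for the new classes", 1.0),
    WorkItem("Update the build script", 0.5),
]

oversized = needs_breakdown(items)
for item in oversized:
    print(f"{item.name}: {item.estimate_days} days, break this down further")
```

Anything the function returns is a candidate for another round of decomposition before the estimate is trusted.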
Writing a spec in some level of detail is also important. Be honest with yourself and explain how things will get done, not just that you need to do them. “Break class Foo into two classes” may not be enough detail. What are the two other classes going to do? How will they work together? My rule of thumb is that a spec isn’t done until someone could pick it up cold and implement a reasonable facsimile of what you intended. If they can’t, it is too vague.
For estimation of larger projects, there is a process called Wideband Delphi estimation that can help to flush out the underlying assumptions. It is very lightweight and, for many projects, can be done by a few people in a couple of hours.
Once the task is broken down into small segments and these segments are estimated, it is important to track how long each item really took. Be honest. You are only cheating yourself if you report a 3-day work item as having taken 1 day. This helps to build experience and make better estimation possible in the future.
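One simple way to use the tracked numbers is to compute how far off past estimates were and scale new estimates by that factor. This is a minimal sketch under my own assumptions: the function name and the sample history are made up, and a single overall ratio is just one of several reasonable ways to summarize the data.

```python
def estimation_ratio(records):
    """Return total actual days divided by total estimated days.

    records is a list of (estimated_days, actual_days) pairs.
    A result above 1.0 means work has been taking longer than estimated.
    """
    total_estimated = sum(estimated for estimated, _ in records)
    total_actual = sum(actual for _, actual in records)
    if total_estimated == 0:
        raise ValueError("need at least one non-zero estimate")
    return total_actual / total_estimated


# Made-up history: (estimated days, actual days) for finished work items.
history = [(1.0, 3.0), (2.0, 2.0), (1.0, 1.0)]

ratio = estimation_ratio(history)

# Scale a fresh 2-day estimate by the historical ratio.
adjusted = 2.0 * ratio
print(f"ratio: {ratio:.2f}, adjusted estimate: {adjusted:.1f} days")
```

The ratio is only as good as the reported numbers, which is why honesty about the 3-day item matters.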
One more thing to consider is whether you are staying on task. When the cause isn’t something we failed to consider, the next most common cause of under-estimation is distraction. How much time are you actually spending on a project? If you spend an hour a day working on it and the other 7 hours doing e-mail, meetings, surfing the web, talking to people, etc., you won’t finish on time (unless you account for an 87.5% buffer).
Speaking of buffering, it is my experience that it is easier to estimate uninterrupted time than to try to estimate how much time all of the distractions will take. On our team we estimate each work item in a vacuum, for example: “This will take 1 day of coding and debugging time.” Meetings, vacations, e-mail and other distractions all fall into another category we call “buffer.” We usually estimate that 30% of an average person’s time will be spent in this buffer state. This varies with the project, of course. Separating the two allows us to better see where the time is being spent. It is a much different issue if someone takes 5 days actually working on what was estimated as a 2-day problem than if they take 2 days coding but are distracted with other items for 3 days.
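The arithmetic behind the buffer is worth spelling out. The sketch below assumes the buffer is expressed as a fraction of total working time, so calendar time is focused time divided by the fraction that is left; the function name is my own and this is not necessarily how any particular team computes it.

```python
def calendar_days(focused_days, buffer_fraction):
    """Convert uninterrupted working days into calendar working days.

    buffer_fraction is the share of total time lost to meetings, e-mail,
    and other distractions. It must be at least 0 and less than 1.
    """
    if not 0 <= buffer_fraction < 1:
        raise ValueError("buffer_fraction must be in [0, 1)")
    return focused_days / (1 - buffer_fraction)


# With a 30% buffer, 7 days of focused work needs about 10 calendar days.
typical = calendar_days(7, 0.30)

# One focused hour in an 8-hour day is an 87.5% buffer,
# so 1 day of focused work stretches to 8 calendar days.
distracted = calendar_days(1, 0.875)

print(f"30% buffer: {typical:.1f} days, 87.5% buffer: {distracted:.1f} days")
```

Keeping the focused estimate and the buffer as separate inputs is what makes it possible to tell a bad estimate apart from a distracted week.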