The value of bad code

Computer scientists often have a hard time getting used to bioinformatics people. We are trained for years to write readable, documented, non-spaghetti code. We are told about UML diagrams, and some of us even take special courses in the once-hyped "software engineering". We are taught all the time that this is the most important part of computer science and that poorly written software causes losses of billions every year.

In science, everything is different, of course. Documentation does not matter. Coding speed and execution speed do, and so do statistics. Forget documentation, forget OO versus functional. Forget source code comments. Big, well-documented, slow projects are no longer needed by the time they are finally finished. Most software here is throwaway code (apart from a few basic tools such as genome browsers, network viewers, alignment editors, and aligners).

I wonder whether this applies only to bioinformatics...

Take Windows 95: crappy code, but delivered just in time to challenge OS/2. Word was a nightmare at first. Linux is not object-oriented in any way. I'm sure that Netscape 3 was a mess.

Most successful software that we use today was half finished at the right moment in time. Timing is critical when a new application for computers suddenly appears. I bet that the core was often developed by a single main programmer, who could write an incredible number of lines of code per day because no communication, documentation, or class diagrams were needed. That programmer made a product that sometimes crashed, but since it was there at the right moment, testers had it earlier, people started to use it, and they found bugs, which were then fixed. I'm sure that it often pays off to be able to hack something together in no time, even if the structure is horribly bad. It gives you a time advantage.

Then, of course, in real life, someone one day has to sort out that code. But by then that is a problem for boring software engineering professors who think that code matters more than time.


Comments

Re: The value of bad code

I think you're wrong.
I come from biotechnology but I'm moving to bioinformatics, and I can tell you that well-commented and well-documented code is very important.

Look, for example, at BioPerl and Biopython: they are huge pools of bioinformatics functions and scripts, and they are all well documented, with wikis and many examples.

Or take EMBOSS, BLAST, Ensembl, and many other programs: I think they are well documented, too.
Which programs are you referring to?


throwaway != bad

Sometimes I think that this argument is oversimplified. Biologist-programmers frequently work on code for one, very specific purpose. Some of it is throwaway but in all likelihood, it will contain modules that are useful for other purposes. I agree that we don't always have time to spend on engineering - I've certainly never used a UML diagram. But I use CVS, I try to comment code, I use modules where possible and I revise code to improve its execution, even if it seems to work. Comments don't take much time and make your code self-documenting. Really bad code will always come back to bite you somewhere down the line. So "quick, dirty, throwaway" doesn't have to equate to "bad".


Re: The value of bad code

How much effort should be diverted into keeping code nice and clean? I guess the balance depends on the project: if you are designing the next-generation air traffic control system, you may do well to heed at least some of the advice of the software engineering professors. If you are working on something more innovative and less well defined, you are probably better off ignoring most of it. Most projects, of course, fall somewhere in between. Problems arise when people decide how much effort goes (or does not go...) into the code based on personal attitudes rather than the circumstances, but these cases usually take care of themselves :-)


sometimes bad code is good

I acknowledge that the post was poorly written. I still think there is a lot of value in being able to quickly hack something together. This is the old throwaway-code discussion rephrased. On the practical side, I've copied a concept from the UCSC source code: a directory called "oneshot". Into oneshot I put everything that has virtually no documentation and serves mostly a single purpose, such as converting from one format to another. It's in the path, and if I ever think that I might have written something before, I just type fileformatname[tab] and bash will tell me whether there is some old code for this lying around...
It sounds silly, but it has saved me a lot of time, and it keeps throwaway code (oneshot) separate from the other scripts (usr/bin). I think it is important, from time to time when programming, to deactivate the "I have to restructure/optimize this" reflexes. I think knowing when it is really worth investing a lot of time is a trait of productive programmers. I'm often more like a PhD geek, 100% focused on his software and wasting a lot of time on unnecessary improvements instead of concentrating on the real task.
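
The lookup itself is just bash tab completion over a directory on the path. For anyone who wants the same check outside an interactive shell, here is a minimal Python sketch of the prefix lookup. The function name find_oneshot and the example file names are hypothetical; nothing here is taken from the UCSC source tree.

```python
import os


def find_oneshot(prefix, directory):
    """Return the sorted names of files in `directory` that start with `prefix`.

    This imitates what typing `prefix[tab]` shows for a directory on the
    path. `directory` is whatever folder holds the throwaway scripts
    (an assumption; the location is up to you).
    """
    try:
        names = os.listdir(directory)
    except FileNotFoundError:
        # No such directory yet: nothing has been written before.
        return []
    return sorted(
        name
        for name in names
        if name.startswith(prefix)
        and os.path.isfile(os.path.join(directory, name))
    )
```

Calling find_oneshot("fasta", "/path/to/oneshot") would list any old scripts whose names begin with "fasta". Naming each script after the file format it handles is what makes the trick work, in bash or here.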