Bio-ignorance

Communicating Biology to Computer Scientists

Many computer scientists and software engineers are not familiar with basic biology or bioinformatics. Many biologists and bioinformaticians are not familiar with basic computer science or software engineering. This article points to some resources that can help with the former, and asks, what can be done about the latter?

Progress in computer science and progress in biology are closely linked, and both depend on people understanding each other's strange language, cross-pollinating ideas and creating technology which hopefully has hybrid vigour. For example, biologists and bioinformaticians have a healthy appetite for all kinds of better, cheaper, faster and sometimes novel computation. This requires that they understand basic computer science and software engineering. In the other direction, computer scientists often need realistic scenarios to motivate the invention, development and testing of genuinely novel technology. As for the software engineers, more on them later...

It sounds great, but before you can even say the word “inter-disciplinary”, there are considerable barriers to communication. The various camps speak different languages, and have radically different cultures. To illustrate this communication breakdown, here is a story from the lab where I work. A while ago, I was discussing the Gene Ontology with a colleague, who shall remain anonymous. This colleague was educated, doing PhD-level research, and what I'd consider a fairly typical computer scientist. Soon the conversation turned to chromosomes, and they asked me:

“What is a Chromosome?”

Initially I was shocked. How could somebody not know what a chromosome was? Had they never read a newspaper? Never watched the television? Surely, most people have at least a vague idea what a chromosome is? After recovering from the shock, I told this person that according to the Gene Ontology a chromosome is “a very long molecule of DNA and associated proteins that carries hereditary information.” Perhaps this bio-ignorance is an extreme case, but unfortunately, it is all too common. Many computer scientists and software engineers I know stopped studying biology as soon as they possibly could, opting for the so-called “harder” sciences: physics, chemistry and mathematics. Consequently, many (but not all) computer scientists are bio-ignorant. What can we do about it? We really need to understand each other if we are going to make any progress. How can we improve communication between biologists and computer scientists?

Part of the solution to this problem is well-written literature that explains basic concepts quickly and clearly without getting bogged down in jargon or stuck on esoteric details; see the references below for some examples. One of my personal favourites is a little book called The Human Genome: a beginner's guide to the chemical code of life, by Jeremy Cherfas. This book is lavishly illustrated and beautifully written, but most importantly of all, at 72 pages it is blisteringly concise, so it stands a chance of being read by computer geeks and nerds. It is even funny in places: the Nobel laureate and geneticist Thomas Hunt Morgan is amusingly depicted as a red-eyed wild type, just like the fruit flies he studied. Anyway, I lent my copy of said book to my computer science buddy, and they learnt not just what chromosomes are, but also a little bit about why Biology and Genetics are such fascinating subjects.

The literature listed below can help outsiders understand biology, but communication is a two-way street. What about the other direction? Is there any literature that explains computer science and software engineering specifically to biologists and bioinformaticians? I don't know of any particularly good examples that are concise, well written and illustrated, but perhaps you do. I've frequently found that bioinformaticians and biologists misunderstand what computer science is about, and confuse it with software engineering, but that is another story. The moral of this story is: don't be surprised if people working in different fields to you lack a basic understanding of what you consider fundamental concepts that everybody knows. If they are bio-ignorant computer scientists, you should patiently and tirelessly explain yourself and maybe point to some of the resources below. Maybe we can understand each other just a little better.

References

  1. Anonymous GO:0005694 Chromosome: A very long molecule of DNA AmiGO! Your friend in the Gene Ontology
  2. Alvis Brazma, Helen Parkinson, Thomas Schlitt and Mohammadreza Shojatalab (2001) All you need to know about biology in twenty pages European Bioinformatics Institute (EMBL-EBI) (A technical introduction, written for EBI employees, but useful elsewhere)
  3. Jeremy Cherfas (2002) The Human Genome: a beginner's guide to the chemical code of life (isbn:0751337161) Dorling Kindersley (A quick but informative introduction that your granny could understand)
  4. Jeremy Cherfas (2006) International Plant Genetic Resources Institute (IPGRI) public awareness blog IPGRI, Rome, Italy. (Some deserved nodalpoint Google Juice for these news and press releases)
  5. Carole Goble and Chris Wroe (2005) The Montagues and the Capulets: In fair Genomics, where we lay our scene... Comparative and Functional Genomics 5(8):623-632 (A paper describing communication breakdown between two different research “houses”, very possibly the only paper on genomics that will make you laugh. See also Shakespearean Genomics: a plague on both your houses)
  6. John Gribbin Dorling Kindersley's Essential Science: Human Genome, Global warming, Expanding universe, Food for the future, Digital revolution and How the brain works www.dk.com (Some interesting books here)
  7. John W. Kimball Chromosomes Kimball's Biology Pages (How does John Kimball manage to write so much good introductory material about Biology?)
  8. John Bonham, John Paul Jones and Jimmy Page (1969) Communication Breakdown Led Zeppelin (Communication breakdown, it's always the same, I'm having a nervous breakdown, drive me insane!)


This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 License.


Comments

Thanks

This is embarrassing. I came here (after picking up nodalpoint on my referrer logs) and read this article's first screenful. I decided it would be useful for a friend involved in managing a bioinformatics project, and whacked the link off to him. Then I read on. Now I will have to explain to my friend that I wasn't just doing a bit of ego-boosting. He'll understand, I hope. But I did want to thank Duncan for the good words.

The mystery, as far as I am concerned, is why Dorling Kindersley pulled the plug on what they promised us would be a long-running series that would grow. I know, the old joke about "how can you tell when a publisher is lying?" applies. But still, I did that book as a loss leader, on the promise of good promotion and more to come. I was fooled. But I remain proud of it and stand ready to help anyone else who wants to get complex ideas across.


DK-heads

Hi Jeremy, fancy meeting you here :) It's a shame DK have pulled the plug on this series, but these titles are all still available. Perhaps DK have trouble competing with the Very Short Introductions and Icon Books' "Introducing" series?


perennial favourite

Ah yes, interdisciplinary communication, especially between biologists and programmers is a real favourite topic. Thanks for the article and links - it's good to see this addressed from the computer -> biology direction.

I'll start with my own story. A few years ago, I was given the task of redesigning a website for the group in which I worked. The original version had been created using some Mac WYSIWYG package. The boss came in one day to see how it was going and I pulled up some HTML onto the screen. He looked at it blankly. "What's this?" he said. "This isn't how I did it." "Um, I think it is", I replied, "this is HTML - it's what makes web pages appear." Eventually I realised that there was absolutely no connection in his mind between the HTML and the web page. Although he'd created a web page himself, his software had shielded him to such an extent that he had no concept of the link between code and output. So point one is this:

* Many biologists are passive users of computers. They are no more sophisticated than the average home user - they use Office, surf the web and read/send email. They do not even realise that to program means to make a computer do what you want or that in principle, with sufficient programming skill, anything is achievable.

This is something that has always perplexed me, because these same people are well aware of their limitations. They will spend hours doing tedious copy/paste operations which could be done in seconds using a small Perl script. They will even say to you "there has to be a better way to do this". Yet they seem unwilling or unable to take the next step - teaching themselves some basic programming skills. The attitude is almost "but I'm a biologist, I just don't do that sort of thing".

In no particular order, here are a few other issues that I think are perpetuating the division between computational and non-computational biologists:

* The education system. We are still splitting people into "hard science" and "soft science" streams at too early an age (high school or early undergrad). It's still the case that people do maths, physics and computing because they care little for fluffy bunny rabbits, and people do biology because they think maths, physics and computing are hard and irrelevant to life science. I think the only way to fix this is compulsory maths, physics and especially computing in all first year biology courses. I also think undergraduates need to be exposed to real research much earlier in their courses, so that they can see why these tools are useful.

* Microsoft. As many biologists are essentially home computer users, they believe the pointy-clicky GUI world of Windows to be "normal" and Linux/UNIX to be something niche and strange. In fact the reverse is true - UNIX has been around since about 1970, Linux since 1991. This attitude seems to run through our entire society, as when PCs are advertised that can "run Windows as well as Mac", as though that's all there is. You'd think intelligent scientists would see past this - but no.

* IT support for biologists. If you work in an academic setting, IT support for biologists is likely to be woeful, simply because, as outlined above, your average biologist works like a home computer user and uses Mac or Windows. Campus IT services are therefore tailored for this - most of them don't even have a Linux expert on hand. If you want to set up a Linux server or build a cluster, you'll be on your own.

* The amateur nature of bioinformatics. The vast majority of practising bioinformaticians/computational biologists are people who started out as wet-lab research biologists and taught themselves computer programming, just out of interest. This creates a situation where we are seen as being in our own camp. Professional computer scientists take issue with our coding skills, biologists take issue with the fact that we've left the lab. Ideally we'd be at the interface where we can facilitate communication and research in both directions and this does happen in the right environment. Perhaps this will improve as more graduates with degrees in bioinformatics start to trickle out of the system.

I've always been of the opinion that anyone in science should just call themselves "a scientist". A scientist to me is an intelligent person with wide-ranging interests who is able to grasp key concepts from outwith their own field, even if they don't fully comprehend all of the details at first. I also think that we are living in incredibly exciting times. For me, the social changes in communication brought about by the internet and the transformation of biology from a soft science by genomics and computation are the key to blowing away boundaries between traditional research fields and creating truly interdisciplinary science. Unfortunately it's hard not to conclude that the way academia is set up just now is a substantial barrier to this ideal.


Hard / Soft sciences

Hi Neil, I agree with you, the world would be a better place if more scientists were "intelligent people with wide-ranging interests who are able to grasp key concepts from outwith their own field". Unfortunately too many of them are experts in their own fields, and completely ignorant of others. All of the above seems fairly consistent with the Andy Brass rule that "Biology is the subject for people who like science but are scared of mathematics". IMHO you can run, but you can't hide from mathematics because it is fundamental to all the sciences. Biology and bioinformatics are no exception.