
Web Services in the World with Google Maps

Bren Vaughan at the EBI has created a map of bioinformatics Web Services around the world using the Google Maps API. This is similar to the EMBL world application mentioned earlier by Greg.


nodalpoint tracking: sufficiently advanced science web technology

Welcome to part two of what could become nodalpoint's very own version of TRACKING: sufficiently advanced technology (maybe sufficiently advanced science web technology). All of the following tips are related to Science 2.0 (if you can think of some better term to use please let me know). Read on for the tips...


Between CASPs Public Meeting

The CASP organization will be holding a 1-day meeting to report the progress in the area of structure prediction (as assessed by the CASP experiment) to the public. In contrast to the CASP meetings that are exclusively for participants who develop prediction methods, this meeting will be open to the public. Organizers, CASP6 assessors and some of the best CASP6 predictors will present their perspectives on the state-of-the-art in structure prediction.

The meeting will take place at Columbia University in New York, USA, on May 1, 2006. Registration is free, but limited to the first 80 registered participants. See the Cubic website for details.


outsourcing computation clusters

Sun recently announced the availability of the Sun Grid Compute Utility for public use. This is a service that I seem to recall HP and IBM talking about for some time. Looks like Sun is giving it a try. From the Sun Grid website: "Sun Grid provides an easy and affordable access to an enormous computing resource for the predictable and all-inclusive price of $1/CPU-hr."

This is basically a homogeneous compute farm running Solaris 10 with the Sun Grid Engine for job control. As it is open for diverse public use, it accepts only self-contained programs specifically compiled for the Solaris 10 (32 or 64-bit) platform. This is an interesting development at a time when many companies and academic institutions alike are considering the benefits of outsourcing to data centers. Handing off the maintenance and administration of a massive computational center has several advantages: there's some benefit in Sun (or any centralized compute center, like the Los Alamos Blue Mountain computer cluster) being able to scale up easily, and heat removal and power consumption spring quickly to mind as well. However, it's not clear that the needs of a bioinformatics compute cluster mesh well with an implementation as vanilla as Sun's. Bioinformatics programs, at least the ones I run, tend to be development-style programs that benefit from a customizable compute environment (for example, one where you can install software and freeze it at a particular version). From an academic standpoint, many grants don't allow for this type of computing model yet. The pay-as-you-go model does not play well with grant expirations, and transferring funds from your grant to a PayPal account may raise a few eyebrows.
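To get a feel for what the quoted price means for a grant budget, here is a rough back-of-the-envelope sketch. Only the $1/CPU-hr rate comes from the Sun Grid website; the function name and the workload figures are made up for illustration, and real billing may well round or meter usage differently.

```python
# Flat rate quoted on the Sun Grid website: $1 per CPU-hour.
RATE_USD_PER_CPU_HOUR = 1.00


def grid_cost(cpus: int, hours_per_cpu: float,
              rate: float = RATE_USD_PER_CPU_HOUR) -> float:
    """Return the pay-as-you-go cost in US dollars for one job.

    Assumes simple linear billing (CPUs x hours x rate); this is an
    illustration, not Sun's actual metering rules.
    """
    if cpus < 0 or hours_per_cpu < 0:
        raise ValueError("cpus and hours_per_cpu must be non-negative")
    return cpus * hours_per_cpu * rate


# Hypothetical workload: 200 CPUs kept busy for 36 hours each.
print(grid_cost(200, 36))  # 7200.0
```

Even a modest job like this hypothetical one runs to a few thousand dollars, which is the kind of line item a grant budget would need to anticipate.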


The future of computing; science in 2020

Declan Butler gave me a heads-up on this week's Nature Web focus: the future of computing. All the articles are freely available online; curiously, this is due to sponsorship from Microsoft. It appears that Microsoft may be shifting some of its focus to scientific computing: the Nature web special came out of the Microsoft 2020 Science initiative. I previously came across this Microsoft Research paper, Scientific Data Management in the Coming Decade, which addresses many of the issues we confront with biological data management. Microsoft Research has come up before on nodalpoint.

A couple of highlights from the special: Can computers help explain biology? and Vernor Vinge's The creativity machine.


nodalpoint: tracking the evolution of science

I have a series of posts lined up for the next few days with the loose theme of Science 2.0. Of course I don't like all-encompassing marketing terms such as Web 2.0 or Science 2.0, but they are good enough for the time being. In keeping with the Science 2.0 theme, I found this post via my RSS aggregator, and the original was picked up from Postgenomic. Pedro Beltrao has an interesting post on Kevin Kelly (author and technologist), who has this prediction about the evolution of science:

Wiki-Science - The average number of authors per paper continues to rise. With massive collaborations, the numbers will boom. Experiments involving thousands of investigators collaborating on a "paper" will be commonplace. The paper is ongoing, and never finished. It becomes a trail of edits and experiments posted in real time - an ever evolving "document."

Go read Pedro's post for more thoughts on the big issue here: credit. Pedro also raises the possibility of implementing Wiki-Science by authoring a paper on the nodalpoint wiki pages. While I welcome anyone to use the wiki pages to collaboratively write a paper (Google's new acquisition is also an option), I nonetheless must agree with both Pedro and the first commenter: it ain't gonna happen until those in charge have a change of mind, and they won't change their minds until we figure out a new funding model that isn't tied directly to credit.

Maybe an alternative approach is micro-publications: short original pieces of science that tell only part of the story. This way contributions could be tracked, and there would then be an opportunity to work on meta-publications (collections of micro-publications).


Rob Carlson: Synthesis

Here are some follow-up links from the 'Biowar for Dummies' post yesterday. Rob Carlson, who is mentioned in Paul Boutin's article, has an interesting blog where he has been following the spread of avian flu. His focus tends to be on security issues related to biotechnology. He has blogged recently on the global distribution of commercial DNA foundries; the worry here is that unscrupulous synthesis shops may not be screening the sequences that customers are requesting (e.g. smallpox). Furthermore, who is ordering all this DNA, and why? A second piece that I found interesting was on the dramatic increase in China's R&D budget (from US$12.4 billion in 1991 to $84.6 billion in 2003). The article also mentions the effects on PhD production:

The United States, Europe and Japan still produce many PhDs and create a host of jobs. But China is coming on strong. One wild card is whether Chinese PhDs will stay in the United States or return home. While China's PhD production in the United States has increased, PhDs by US white males has dropped from its peak of about 8,900 in 1994 to just over 7,000 in 2003.

Carlson is also working on a book, Learning to Fly: The past, present, and future of Biological Technology; the draft chapters are available online.


Biowar for Dummies

I'm a big fan of DIY biotech stories: DNA preps using your kitchen blender, buying used DNA sequencers off eBay, etc. Tech writer Paul Boutin has gone one step further and asked the question: How hard is it to build your own weapon of mass destruction? For example:

Brent guesses he would need a couple million dollars to whip up a batch of smallpox from scratch. No need for state sponsors or stolen top-secret germ samples. “An advanced grad student could do it,


Integrating BioPAX Compliant Pathway Data

I have been saying for some time now that RDF-compliant data formats will lead to effortless integration of biological resources at the level of the data model. I even presented a poster at ISMB last year along these lines. Of course, I presented my poster at the wrong session and nobody gave it a second glance. However, in the back of my mind I knew people would get it eventually.

Well, it seems that time is now. A group of clever individuals from Stanford have aggregated BioPAX-compliant data from KEGG, EcoCyc and Reactome and built, surprise surprise, a Pathway Knowledge Base. It is too early to tell whether this is the first step along the path to a semantic web for the life sciences. However, it is a step in the right direction.

A short note on the use of 'aggregated' rather than 'integrated' in the previous paragraph. On the project's web page, the authors use 'data integration' rather than 'data aggregation' to describe the work they have done. The term 'data integration' is usually used synonymously with 'heterogeneous data integration', so I don't really see what they have done as 'integration' in that sense, as all the data was already in the same format (BioPAX). Regardless, they are using RDF to do this, and standards are a good thing (TM). While I am not completely convinced that the W3C semantic web recommendations are perfect, I would prefer to see people work with them rather than continue to invent self-contained systems with little added value.
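The distinction can be made concrete with a small sketch. It treats an RDF graph as nothing more than a set of (subject, predicate, object) triples, which is a simplification, and every identifier and triple in it is invented for illustration; none of it is taken from KEGG, EcoCyc, Reactome or the Stanford project. Aggregation is then just set union: it works without effort because the sources share a format, but it does nothing to reconcile two sources that describe the same thing under different identifiers.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Two hypothetical sources that already share one vocabulary.
# All identifiers below are made up for this example.
source_a: Set[Triple] = {
    ("ex:a/glycolysis", "rdf:type", "bp:pathway"),
    ("ex:a/glycolysis", "bp:NAME", "Glycolysis"),
    ("ex:shared/ATP", "rdf:type", "bp:smallMolecule"),
}
source_b: Set[Triple] = {
    ("ex:b/pathway-42", "rdf:type", "bp:pathway"),
    ("ex:b/pathway-42", "bp:NAME", "glycolysis"),
    ("ex:shared/ATP", "rdf:type", "bp:smallMolecule"),
}


def aggregate(*graphs: Set[Triple]) -> Set[Triple]:
    """Merge graphs by plain set union; identical triples collapse."""
    merged: Set[Triple] = set()
    for graph in graphs:
        merged |= graph
    return merged


merged = aggregate(source_a, source_b)

# Subjects typed as pathways in the merged graph.
pathways = {s for (s, p, o) in merged
            if p == "rdf:type" and o == "bp:pathway"}

print(len(merged))    # 5: the shared ATP triple is stored once
print(len(pathways))  # 2: the two glycolysis records stay separate
```

The shared triple deduplicates for free, but the two glycolysis entries remain distinct resources. Deciding that they denote the same pathway is the semantic heterogeneity problem, and that is the part I would call integration.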

It will be interesting to see the paper once it emerges, to see if they had to deal with any semantic heterogeneity. Look out for personal semantic integration desktop software in the future.