Integrating BioPAX Compliant Pathway Data

I have been saying for some time now that RDF-compliant data formats will lead to effortless integration of biological resources at the level of the data model. I even presented a poster at ISMB last year along these lines. Of course, I presented it at the wrong session and nobody gave it a second glance. However, in the back of my mind I knew people would get it eventually.

Well, it seems that time is now. A group of clever individuals from Stanford have aggregated BioPAX-compliant data from KEGG, EcoCyc and Reactome and built, surprise surprise, a pathway knowledge base. It is too early to tell whether this is the first step along the path to a semantic web for the life sciences. However, it is a step in the right direction.

A short note on the use of 'aggregated' rather than 'integrated' in the previous paragraph. On the project's web page the authors use 'data integration' rather than 'data aggregation' to describe the work they have done. The term 'data integration' is usually used synonymously with 'heterogeneous data integration', so I don't really see what they have done as 'integration' in that sense, as all the data was already in the same format (BioPAX). Regardless, they are using RDF to do this, and standards are a good thing (TM). While I am not completely convinced that the W3C semantic web recommendations are perfect, I would prefer to see people work with them rather than continue to invent self-contained systems with little added value.
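
To make the distinction concrete, here is a minimal sketch of what I mean by aggregation. It assumes each source can be treated as a plain set of (subject, predicate, object) triples that already share one vocabulary. Every URI, property name and helper function below is invented for illustration; none of it is taken from the Stanford project or from the real BioPAX, KEGG or Reactome identifiers, and blank nodes are ignored entirely.

```python
# Hypothetical namespace and type URI, standing in for a shared vocabulary.
BP = "http://example.org/biopax#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Two made-up sources, each a set of (subject, predicate, object) triples.
source_a = {
    ("http://example.org/a/pathway1", RDF_TYPE, BP + "pathway"),
    ("http://example.org/a/pathway1", BP + "NAME", "Glycolysis"),
}
source_b = {
    ("http://example.org/b/pathwayX", RDF_TYPE, BP + "pathway"),
    ("http://example.org/b/pathwayX", BP + "NAME", "Glycolysis"),
}


def aggregate(*graphs):
    """Aggregate graphs by taking the set union of their triples."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def pathways_named(graph, name):
    """Return the subjects typed as a pathway that carry the given name."""
    typed = {s for s, p, o in graph if p == RDF_TYPE and o == BP + "pathway"}
    return sorted(
        s for s, p, o in graph
        if p == BP + "NAME" and o == name and s in typed
    )


merged = aggregate(source_a, source_b)
for subject in pathways_named(merged, "Glycolysis"):
    print(subject)
```

One query now runs across both sources because they share a vocabulary, which is the easy part. The result still contains two separate 'Glycolysis' resources. Deciding whether they denote the same pathway is the heterogeneous integration problem, and nothing in the union step addresses it.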

It will be interesting to read the paper once it emerges, to see whether they had to deal with any semantic heterogeneity. Look out for personal semantic integration desktop software in the future.