Six grueling days and I still have the conference dinner and my poster session to go. I haven't found any of today's presentations particularly interesting, so I've skipped most of the sessions. I will get along to the keynote tonight at 5:10pm.
I've walked into the IBM demo 40 minutes late; it goes for two hours, and it looks like I've just caught the end of the technical introduction. Basically IBM are pushing something called the DB2 "graph extender" (Google doesn't seem to have anything on "graph extender"), which, by the looks of the last few slides, is basically turning DB2 into a graph database. She runs through some of the queries; it doesn't look like it is based on RDF or any of the semantic web technologies.
The query language looks specific to DB2, but I only saw two slides. She is moving on to the demo itself now, using simple CGI script interfaces to the DB2 graph database. She is showing an example of shortest path queries on yeast data sets, but no real merging example?
The obvious question is how they get this kind of data to integrate, and how the identifier mapping is done. Maybe I missed this.
Someone has asked where RDF fits in...
Write a wrapper around a site that speaks RDF... the answer is not that good. It seems we are at the point where people are now aware of RDF as a significant technology but don't really yet appreciate the technical details. For example, one guy mentions that if we grab an RDF graph we could potentially suck down the entire web... all because RDF uses URIs? Okay then, we haven't even solved the RDF URI resolution problem; see the LSID discussion, URIQA etc. Afterwards I spoke to one of the IBM guys, who told me that they have researchers at the IBM Cambridge labs who are working on integrating RDF technology into DB2.


Comments
Oracle are active in this space too
I bumped into Susie Stephens, from Oracle's Life Science marketing group, here at ISMB.
Oracle are active in this space too. e.g. they are a participant in the BioDASH project.
Oracle already has support for graphs (or, in Oracle's terminology, networks) built into its database, allowing various types of topological query to be run on the network connections that are stored in the db.
Apparently in 10g release 2 (due out any time now) there will be explicit support for RDF, built on top of this underlying network representation/querying technology.
Earlier presentations at ISMB have mentioned how poorly most current RDF storage/querying technologies perform, as the number of 'facts' scales up. Will be interesting to see how the various technologies in this space evolve, as they start to be thrown at non-toy problems.
Good point. Scalability of triple stores for biological applications (or in general) is clearly something that will need to be addressed. There is a report from SWAD-Europe on RDF scalability. Relevant quote:
My understanding is that both DB2 and Oracle 10g do graphs on top of their relational dbs.
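To make the "graphs on top of a relational db" idea concrete, here is a minimal sketch of a shortest path query over an edge table. It is only an illustration and says nothing about how the DB2 graph extender or Oracle's network support are actually implemented. Everything in it is an assumption: the `edge` table layout, the use of SQLite, the made-up node labels `P1` to `P5` standing in for yeast proteins, and the choice of a plain breadth-first search in Python that asks the database for each node's neighbours.

```python
import sqlite3
from collections import deque


def build_db(edges):
    """Load an undirected graph into a hypothetical two-column edge table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edge (src TEXT, dst TEXT)")
    # Store every interaction in both directions so lookups by src suffice.
    rows = [(a, b) for a, b in edges] + [(b, a) for a, b in edges]
    conn.executemany("INSERT INTO edge VALUES (?, ?)", rows)
    conn.execute("CREATE INDEX edge_src ON edge (src)")
    return conn


def shortest_path(conn, start, goal):
    """Breadth-first search; returns a list of nodes, or None if unreachable."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the parent links back to the start to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        neighbours = conn.execute(
            "SELECT dst FROM edge WHERE src = ? ORDER BY dst", (node,)
        ).fetchall()
        for (nxt,) in neighbours:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


# Made-up interaction data, purely for illustration.
EDGES = [("P1", "P2"), ("P2", "P3"), ("P3", "P4"), ("P1", "P5"), ("P5", "P4")]

if __name__ == "__main__":
    conn = build_db(EDGES)
    print(shortest_path(conn, "P1", "P4"))
```

Each step of the search is one indexed lookup on the edge table, so the number of round trips grows with the number of nodes visited. That cost is presumably part of why scalability comes up as soon as these graphs stop being toy-sized.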