I'm sitting in on the ontologies and database track at ISMB today. The wireless isn't working in the main conference room and I'm a little fuzzy from drinking German beer last night, but thankfully I didn't find myself wearing my lanyard to the bar or taking my laptop to dinner. Not that I can say the same for others...
Data integration introduction
- How do we control people ?
- How do we maintain consistency ?
- Theme of data integration has been around for a while (early 90s)
- Talking about RDF and data integration
- Raving on basically...
The presenter is Finnish and his accent is a little funny, kind of like listening to the chef from the Muppets give a scientific talk. He's cool though...
Data integration and visualization
- Natural to formal system descriptions
- Problem: huge number of databases (heterogeneous)
- Observations and facts scattered across many many databases
- Many ways to name and describe biological systems ?
Does he mean that a name is the same thing as an identifier ?
- This guy uses XML for data storage and query
- Use maps {dictionaries} from one ID to another
- Something about traversal using these maps
- They have a java gui, with the usual network viewer...
- Shows example of MINT and KEGG integration
It is not really clear from the talk how this system works or how it is more effective than other systems. It really is Yet Another Bioinformatics Data Integration System (YABDIS).
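My best guess at what the "maps from one ID to another" idea looks like, as a minimal Python sketch. The talk gave no implementation details, so the identifiers, the two example maps and the `traverse` function are all made up by me for illustration; this is not the presenter's system.

```python
# Hypothetical ID maps (placeholder identifiers, not real database entries).
# Each map is a dictionary from an ID in one database to IDs in another.
gene_to_protein = {
    "geneA": ["protA"],
    "geneB": ["protB1", "protB2"],
}
protein_to_pathway = {
    "protA": ["pathway1", "pathway2"],
    "protB1": ["pathway2"],
}


def traverse(start_id, maps):
    """Follow a chain of ID maps from start_id and return the reachable IDs."""
    frontier = {start_id}
    for mapping in maps:
        next_frontier = set()
        for identifier in frontier:
            # IDs with no entry in a map simply drop out of the traversal.
            next_frontier.update(mapping.get(identifier, []))
        frontier = next_frontier
    return frontier


pathways = traverse("geneA", [gene_to_protein, protein_to_pathway])
```

Every new database needs at least one new map to be connected to the rest, which is presumably where the manual effort goes.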
- Discussion of the problems of natural language processing, this is something to do with context based mining...
- Mouse with human legs, example of searching for an orange car, a car made out of an orange ?
Doesn't explain the technical details of how the ontologies are used with the data.
- Uses ontologies to do context mining
- Use clustering in the neighborhood of different genes ?
- Sammon's mapping algorithm is used to map multidimensional distance vectors into 2-dimensions ??
- They want to implement other types of algorithms
Not sure where all this is coming from ?
- Interesting images of yeast metabolic network combined with protein protein interactions, combined into a single network
Not sure why or how you would want to do this
- Performs scaling calculations on the combined network
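For my own reference, here is a rough Python sketch of the Sammon's mapping idea mentioned above: place points in two dimensions so that their pairwise distances stay close to the original high-dimensional distances, by minimizing Sammon's stress. This version uses plain gradient descent with step halving, which is my simplification and not Sammon's original update rule, and it is certainly not the presenter's code. The function names and parameters are my own, and it assumes at least two distinct input points.

```python
import math
import random


def pairwise(points):
    """Matrix of Euclidean distances between all pairs of points."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def sammon_stress(target, layout):
    """Sammon's stress between target distances and distances in the layout."""
    n = len(layout)
    total = sum(target[i][j] for i in range(n) for j in range(i + 1, n))
    err = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            if target[i][j] > 0:  # skip duplicate points
                d = math.dist(layout[i], layout[j])
                err += (target[i][j] - d) ** 2 / target[i][j]
    return err / total


def sammon_map(points, iterations=200, step=0.5, seed=0):
    """Return a 2D layout and its stress (gradient descent with step halving)."""
    rng = random.Random(seed)
    n = len(points)
    target = pairwise(points)
    total = sum(target[i][j] for i in range(n) for j in range(i + 1, n))
    layout = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(n)]
    stress = sammon_stress(target, layout)
    for _ in range(iterations):
        grad = [[0.0, 0.0] for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if i == j or target[i][j] == 0:
                    continue
                d = max(math.dist(layout[i], layout[j]), 1e-12)
                coeff = -2.0 / total * (target[i][j] - d) / (target[i][j] * d)
                for k in range(2):
                    grad[i][k] += coeff * (layout[i][k] - layout[j][k])
        candidate = [
            [layout[i][k] - step * grad[i][k] for k in range(2)] for i in range(n)
        ]
        new_stress = sammon_stress(target, candidate)
        if new_stress < stress:
            layout, stress = candidate, new_stress  # accept the improvement
        else:
            step /= 2  # reject the step and try a smaller one next time
    return layout, stress


# Five made-up points in 3D, projected down to 2D.
points = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)]
layout, stress = sammon_map(points)
```

The input here is a set of coordinate vectors, but the method only needs the distance matrix, so it would work just as well on distances computed some other way.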
Skips to something completely different now:
- Exposing subjects predicates and objects from the literature
- Use this to create text mining ontology
- Claims their bioinformatics database/framework can incorporate just about any kind of data ? (models, pathways, chemical information, ...)
I'm always suspicious of this kind of claim. This is complicated stuff, and if your system can already do this (as you claim) then shouldn't we all just go home ? This guy was pretty nervous, which is reasonable given that this is an international conference.
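To make the subject/predicate/object idea concrete for myself, here is a small Python sketch of how extracted statements could be stored and queried as triples. The example statements, the `Triple` type and the `match` function are hypothetical and were not shown in the talk.

```python
from collections import namedtuple

# One extracted statement: who (subject) does what (predicate) to whom (object).
Triple = namedtuple("Triple", "subject predicate object")

# Made-up statements standing in for text mining output.
triples = [
    Triple("proteinA", "interacts_with", "proteinB"),
    Triple("proteinA", "located_in", "nucleus"),
    Triple("proteinB", "located_in", "cytoplasm"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples matching the given fields; None acts as a wildcard."""
    return [
        t
        for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.object == obj)
    ]


located = match(triples, predicate="located_in")
# The distinct predicates are a crude starting vocabulary for an ontology.
vocabulary = sorted({t.predicate for t in triples})
```

Collecting the distinct predicates like this is only a first step. Turning them into a real ontology would still need someone to define how the terms relate to each other.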
Questions:
Scaling issues with the maps concept, i.e. if the maps are built manually then how will this system scale ?
Also how about retrieval times for massive XML data sets ?

