There was an article at BioIT-World this week about a Malaysian company named Synamatix:
...Synamatix’s approach is to find patterns in sequence data and identify relationships between the patterns. This information is used to infer the function and significance of various patterns.... automatically learns and identifies similar patterns from raw data sets and stores each unique pattern only once. This helps deal with scaling (less data needs to be stored) and computational speed.
For those of you interested in data analysis (in and out of databases), you can read the company's white papers, where you might get a hint of what they did and how they did it...
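
To make the "stores each unique pattern only once" idea concrete, here is a minimal sketch. It assumes that a "pattern" is simply a fixed-length substring (a k-mer), which is my simplification and not something the article or the company confirms. The function names and the toy sequences are made up for illustration.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every overlapping substring of length k in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_pattern_store(sequences, k):
    """Keep one entry per unique k-mer, with a count of how often it occurs.

    Assumption: a 'pattern' is a plain k-mer. The real system may define
    patterns very differently.
    """
    store = Counter()
    for seq in sequences:
        store.update(kmers(seq, k))
    return store


sequences = ["ACGTACGT", "CGTACGTA"]  # toy data, not real sequence reads
store = build_pattern_store(sequences, k=4)
total = sum(store.values())
print(f"{total} k-mer occurrences, {len(store)} unique patterns stored")
```

On repetitive data the number of unique patterns is much smaller than the number of occurrences, which is where the storage savings would come from.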


Comments
I read a few of their technical papers, which are rather light on details. I am also not encouraged by their claims:
and
. Maybe I'm missing something...
What's missing?
It looks like an uber indexing system. The performance would therefore come from rapid access to this index during query operations. The BLAT program does something like this for the UCSC genomes.
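
A rough sketch of what index-backed lookup could look like, assuming a plain k-mer index over one sequence. This is a generic illustration, not BLAT's or Synamatix's actual algorithm, and the names `build_index` and `find_exact` are hypothetical.

```python
from collections import defaultdict


def build_index(genome, k):
    """Map every k-mer in genome to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(genome) - k + 1):
        index[genome[i:i + k]].append(i)
    return index


def find_exact(genome, index, query, k):
    """Return start positions of exact matches of query in genome.

    The first k letters of the query act as a seed: only the positions the
    index lists for that seed are checked, instead of scanning the genome.
    """
    if len(query) < k:
        raise ValueError("query must be at least k letters long")
    return [pos for pos in index.get(query[:k], [])
            if genome.startswith(query, pos)]


genome = "GATTACAGATTACCA"  # toy sequence for illustration
index = build_index(genome, k=4)
print(find_exact(genome, index, "GATTAC", k=4))  # [0, 7]
```

The index is built once up front, so each query only touches the handful of candidate positions listed for its seed.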