PubChem and ChEBI

Interested to see that two small molecule databases aimed at biologists are entering the public domain.

PubChem and ChEBI

Anyone care to comment on the pros and cons of these systems from the bio point of view?


Comments


ChEBI and PubChem

A summary of the state of play with these two systems can be found on the Nature site.

db


SMID

You may also want to take a look at the Small Molecule Interaction Database (SMID). It's from the same people who brought us BIND.

SMID


PubChem

Hi,

I think PubChem is rather innovative and includes data that is not readily available elsewhere, such as all the small molecule ligands in the MMDB/PDB database. The cross-links to other sites are really cool. I am still learning how best to use the system.

FYI...

PubChem has over 825,000 structures, ~648,000 of which are unique. The data sources currently include: BioCyc (~1,400), ChemBank (~4,700), ChemID (~208,000), DTP (~268,000), KEGG (~11,000), MMDB/PDB (~20,000), MOLI (~1,900), NIAID (~114,000), NIST WebBook (~47,000), and NIST/EPA/NIH (~147,000).
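As a quick sanity check on those figures, here is a minimal sketch that adds up the per-source counts quoted above and estimates how much overlap there is between sources. The numbers are the rounded ones from this post, not values pulled from PubChem itself, and the variable names are just for illustration. With these rounded inputs the sources add up to roughly 823,000, which is consistent with the "over 825,000" total, and a little over a fifth of the records appear to duplicate a structure deposited by another source.

```python
# Approximate per-source structure counts, copied from the post above.
source_counts = {
    "BioCyc": 1_400,
    "ChemBank": 4_700,
    "ChemID": 208_000,
    "DTP": 268_000,
    "KEGG": 11_000,
    "MMDB/PDB": 20_000,
    "MOLI": 1_900,
    "NIAID": 114_000,
    "NIST WebBook": 47_000,
    "NIST/EPA/NIH": 147_000,
}

# Totals quoted in the post; "over 825,000" is treated as a lower bound.
total_quoted = 825_000
unique_quoted = 648_000

# Sum of the rounded per-source counts.
total_from_sources = sum(source_counts.values())

# Share of deposited records that are not unique structures.
duplicate_fraction = 1 - unique_quoted / total_quoted

print(f"Sum of listed sources: ~{total_from_sources:,}")
print(f"Approximate duplicate share: {duplicate_fraction:.1%}")
```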

I hear they are going to add more large data sets and that several commercial compound suppliers want to put their structures in there.

I think there is little competition with other systems. It is mostly a repository that is cross-linked with literature.

Check it out!

http://pubchem.ncbi.nlm.nih.gov


Chembank

You may also want to look at Chembank, which focuses on the biological properties of small molecules. See Stuart Schreiber's group for applications to genetics.


Small molecules for system perturbation

Not that I use small molecules, but one possibility would be to accumulate a database of useful perturbations. Small molecule A inhibits the interaction of protein1 with protein2 but not the interactions of protein1 with protein3. This type of data is very useful if you want to study biological systems in general.
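To make the idea concrete, here is a rough sketch of what one record in such a perturbation database might look like, and how you could ask for molecules that hit one interaction while sparing another. Everything here is hypothetical: the `Perturbation` record, the `selective_inhibitors` helper, and the toy data are made up for illustration and do not correspond to any existing database schema.

```python
from typing import NamedTuple


class Perturbation(NamedTuple):
    """One hypothetical observation: does `molecule` inhibit the A-B interaction?"""
    molecule: str
    protein_a: str
    protein_b: str
    inhibits: bool


# Toy data, invented for illustration only.
records = [
    Perturbation("A", "protein1", "protein2", True),
    Perturbation("A", "protein1", "protein3", False),
    Perturbation("B", "protein1", "protein2", True),
    Perturbation("B", "protein1", "protein3", True),
]


def selective_inhibitors(records, target, spare):
    """Return molecules recorded as inhibiting `target` and as NOT inhibiting `spare`.

    `target` and `spare` are pairs of protein names; order within a pair is ignored.
    Molecules with no recorded result for `spare` are excluded, not assumed safe.
    """
    def pair(r):
        return frozenset((r.protein_a, r.protein_b))

    hits = {r.molecule for r in records
            if r.inhibits and pair(r) == frozenset(target)}
    spared = {r.molecule for r in records
              if not r.inhibits and pair(r) == frozenset(spare)}
    return sorted(hits & spared)


print(selective_inhibitors(records,
                           ("protein1", "protein2"),
                           ("protein1", "protein3")))
```

Requiring an explicit negative result for the spared interaction is a deliberate choice in this sketch: absence of data is not treated as evidence that the molecule leaves the interaction alone.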

You could imagine a program that would use cellular network databases and enzyme property databases to simulate, say, a pathogen, and then retrieve from a small molecule database the combination of molecules that would do the most damage to that particular pathogen without hurting the host.


PubChem looks good

I just glanced over them. I like PubChem and the way it integrates into the NCBI website/database structure. I imagine they'd be of use primarily for bioassays. Another use might be in docking simulations, in which case some way of converting to the required file formats would be handy (and may exist; as I say, I just glanced).

What's amusing is the way in which EBI/NCBI claim to be complementary and to serve users on each side of the Atlantic, but are clearly in competition in some areas, as here. You see this with other resources, e.g. TIGRFAM/PFAM.


Ad hoc

Hi David,

We have a fairly ad hoc approach to administration here at nodalpoint. We try to strike a balance between self-promotion and contribution, which is why your two stories are currently in the moderation queue (I suspect). I'll consider moderating these up to the front page if we lose the orange XML button in the post. People can find your site via your nodalpoint user page, which is generally what people will do if they find your posts interesting.

Update: published, minus the XML button.

As to the usefulness of these small molecule databases, more information in the public domain is a good thing. I don't do small molecules so I'll wait for someone else to enlighten me...