ChemDB update - full-text search and virtual chemical space

ChemDB is a chemical database containing nearly 5M commercially available small molecules, important for use as synthetic building blocks, as probes in systems biology and as leads for the discovery of drugs and other useful compounds. The data are publicly available over the web for download and for targeted searches using a variety of powerful methods. The chemical data include predicted or experimentally determined physicochemical properties, such as 3D structure, melting temperature and solubility. Recent developments include optimization of chemical structure (and substructure) retrieval algorithms, enabling full database searches in less than a second. A text-based search engine allows efficient searching of compounds based on over 65M annotations from over 150 vendors. When searching for chemicals by name, fuzzy text matching yields productive results even when the correct spelling of a chemical name is unknown, taking advantage of both systematic and common names. Finally, built-in reaction models enable searches through virtual chemical space, consisting of hypothetical products readily synthesizable from the building blocks in ChemDB.

AVAILABILITY: ChemDB and Supplementary Materials are available at http://cdb.ics.uci.edu.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
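
The fuzzy name matching mentioned in the abstract can be illustrated with a minimal sketch. This is not ChemDB's implementation, which the abstract does not describe; it assumes a small hypothetical in-memory index (`NAME_INDEX`, with made-up compound identifiers) and swaps in the similarity ratio from Python's standard `difflib` module as the matching technique. The function name `fuzzy_name_search` and its parameters are likewise illustrative.

```python
import difflib

# Hypothetical index: systematic and common names mapped to made-up
# compound identifiers. Several names may point at the same compound.
NAME_INDEX = {
    "acetylsalicylic acid": "cmpd-1",
    "aspirin": "cmpd-1",
    "acetaminophen": "cmpd-2",
    "paracetamol": "cmpd-2",
    "caffeine": "cmpd-3",
    "1,3,7-trimethylxanthine": "cmpd-3",
}


def fuzzy_name_search(query, index=NAME_INDEX, limit=3, cutoff=0.6):
    """Return up to `limit` (name, compound_id) pairs, best match first.

    Names whose similarity ratio to the query falls below `cutoff`
    are discarded, so a nonsense query returns an empty list.
    """
    normalized = query.strip().lower()
    matches = difflib.get_close_matches(
        normalized, list(index), n=limit, cutoff=cutoff
    )
    return [(name, index[name]) for name in matches]


if __name__ == "__main__":
    # Misspelled queries still resolve to an indexed name.
    for query in ("asprin", "Acetominophen", "zzzz"):
        print(query, "->", fuzzy_name_search(query))
```

Indexing systematic and common names side by side means a misspelling of either form can resolve to the same compound. At the scale the abstract reports (over 65M annotations), a linear scan like this one would be too slow, so a production system would need an indexed approach.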
