EDULISS: a small-molecule database with data-mining and pharmacophore searching capabilities

We present EDULISS (EDinburgh University Ligand Selection System), a relational database that stores structural, physicochemical and pharmacophoric properties of small molecules. The database holds over 4 million commercially available compounds from 28 suppliers. A web-based interface (available at http://eduliss.bch.ed.ac.uk/) provides a range of data-mining functions. For each compound, a single 3D conformer is stored along with over 1600 calculated descriptor values (molecular properties). We demonstrate that small subgroups of these descriptors give an efficient method for unique compound recognition, which is particularly useful in a large-scale database. Many of the shape and distance descriptors are held as pre-calculated bit strings, permitting fast similarity and pharmacophore searches that can identify families of related compounds for biological testing. Two ligand-searching applications demonstrate how EDULISS can be used to extract families of molecules with selected structural and biophysical features.
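
To illustrate why pre-calculated bit strings make similarity searching fast, the following is a minimal sketch that compares descriptor bit strings with the Tanimoto coefficient. The abstract does not state which similarity measure EDULISS uses, so the Tanimoto coefficient, the function names, the 8-bit fingerprints and the compound identifiers are all assumptions made for illustration only.

```python
from typing import Dict, List, Tuple


def popcount(bits: int) -> int:
    """Count the set bits in a bit string stored as a Python integer."""
    return bin(bits).count("1")


def tanimoto(a: int, b: int) -> float:
    """Tanimoto coefficient: shared set bits divided by bits set in either string."""
    union = popcount(a | b)
    if union == 0:
        return 0.0  # two empty bit strings: define similarity as zero
    return popcount(a & b) / union


def similarity_search(
    query: int, library: Dict[str, int], threshold: float = 0.7
) -> List[Tuple[str, float]]:
    """Return (compound id, score) pairs scoring at least `threshold`, best first."""
    scored = ((cid, tanimoto(query, bits)) for cid, bits in library.items())
    hits = [(cid, score) for cid, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical 8-bit descriptor bit strings; real ones would be much longer.
library = {
    "cmpd_A": 0b10110110,
    "cmpd_B": 0b10110010,
    "cmpd_C": 0b01001001,
}
query = 0b10110110
hits = similarity_search(query, library, threshold=0.7)
print(hits)
```

Because each comparison reduces to bitwise AND, OR and a bit count, scoring a query against a stored bit string is cheap compared with recomputing shape or distance descriptors from 3D coordinates at search time.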
