Pharmacophore alignment search tool: Influence of the third dimension on text‐based similarity searching

Previously (Hähnke et al., J Comput Chem 2010, 31, 2810), we introduced nonlinear dimensionality reduction for canonizing two‐dimensional layouts of molecular graphs as a foundation for text‐based similarity searching with our Pharmacophore Alignment Search Tool (PhAST), a ligand‐based virtual screening method. Here, we apply these methods to three‐dimensional molecular conformations, investigate the impact of the additional degrees of freedom on virtual screening performance, and assess differences in ranking behavior. The best‐performing variants of PhAST are compared with 16 state‐of‐the‐art screening methods with respect to significance estimates for differences in screening performance. We show that PhAST places new chemotypes at early ranks without sacrificing overall screening performance. Combining PhAST with other virtual screening techniques by rank‐based data fusion significantly improved screening capabilities. We also present a parameterization of double dynamic programming for small‐molecule comparison, which allows structural similarity between compounds to be calculated from one‐dimensional representations and opens the door to a holistic approach to molecule comparison based on textual representations. © 2011 Wiley Periodicals, Inc. J Comput Chem, 2011.
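
The abstract mentions rank‐based data fusion but does not state which fusion rule was used. The following Python sketch illustrates the general idea under the assumption of mean‐rank fusion; the function names, the tie‐breaking by compound identifier, and the toy scores are hypothetical and are not taken from the paper.

```python
from statistics import mean


def ranks_from_scores(scores):
    """Map compound id -> rank (1 = best), assuming higher score = more similar."""
    # Ties are broken by compound id so the ranking is deterministic (assumption).
    ordered = sorted(scores, key=lambda cid: (-scores[cid], cid))
    return {cid: pos for pos, cid in enumerate(ordered, start=1)}


def fuse_by_mean_rank(score_tables):
    """Fuse several score tables into one ranked list of compound ids.

    Each table maps compound id -> similarity score from one screening method.
    Only compounds scored by every method are fused (assumption).
    """
    rank_tables = [ranks_from_scores(table) for table in score_tables]
    common = set(rank_tables[0]).intersection(*rank_tables[1:])
    fused = {cid: mean(r[cid] for r in rank_tables) for cid in common}
    # A lower mean rank is better; ties are again broken by compound id.
    return sorted(fused, key=lambda cid: (fused[cid], cid))


# Toy example with made-up scores for two hypothetical methods.
method_a = {"c1": 0.9, "c2": 0.5, "c3": 0.1}
method_b = {"c1": 0.2, "c2": 0.8, "c3": 0.7}
print(fuse_by_mean_rank([method_a, method_b]))
```

Working on ranks rather than raw scores avoids having to put the similarity values of different methods on a common scale, which is the usual motivation for rank‐based fusion.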
