Stereoselective virtual screening of the ZINC database using atom pair 3D-fingerprints

Abstract

Background: Tools that search large compound databases for analogs of query molecules are strategically important in drug discovery, because they help identify available analogs of any given reference or hit compound by ligand-based virtual screening (LBVS). We recently showed that large databases can be formatted for very fast searching with various 2D-fingerprints using the city-block distance as similarity measure, in particular a 2D-atom pair fingerprint (APfp) and the related category extended atom pair fingerprint (Xfp), which efficiently encode molecular shape and pharmacophores but do not perceive stereochemistry. Here we investigated related 3D-atom pair fingerprints to enable rapid stereoselective searches in the ZINC database (23.2 million 3D structures).

Results: Molecular fingerprints counting atom pairs at increasing through-space distance intervals were designed using either all atoms (16-bit 3DAPfp) or different atom categories (80-bit 3DXfp). These 3D-fingerprints retrieved molecular shape and pharmacophore analogs (defined by OpenEye ROCS scoring functions) of 110,000 compounds from the Cambridge Structural Database with equal or better accuracy than the 2D-fingerprints APfp and Xfp, and showed comparable performance in recovering actives from decoys in the DUD database. LBVS by 3DXfp or 3DAPfp similarity was stereoselective and gave very different analogs when starting from different diastereomers of the same chiral drug. Results also differed from LBVS with the parent 2D-fingerprints Xfp or APfp. 3D- and 2D-fingerprints likewise gave very different results in LBVS of folded molecules, where through-space distances between atom pairs are much shorter than topological distances.

Conclusions: 3DAPfp and 3DXfp are suitable for stereoselective searches for shape and pharmacophore analogs of query molecules in large databases. Web browsers for searching ZINC by 3DAPfp and 3DXfp similarity are accessible at www.gdb.unibe.ch and should provide useful assistance to drug discovery projects.

Graphical abstract: Atom pair fingerprints based on through-space distances (3DAPfp) provide better shape encoding than atom pair fingerprints based on topological distances (APfp), as measured by the recovery of ROCS shape analogs by fingerprint similarity.
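The idea of counting atom pairs in through-space distance intervals and comparing the counts by city-block distance can be illustrated with a minimal sketch. It is not the published 3DAPfp implementation: the abstract does not give the actual distance intervals or any weighting scheme, so the 1 Å bin width, the hard binning, and the function names `atom_pair_3d_fingerprint` and `city_block_distance` are assumptions made only for illustration. The two toy coordinate sets mimic an extended and a folded four-atom chain.

```python
from itertools import combinations
from math import dist

def atom_pair_3d_fingerprint(coords, n_bins=16, bin_width=1.0):
    """Count atom pairs per through-space distance interval.

    coords: list of (x, y, z) tuples in angstroms.
    n_bins and bin_width are hypothetical; the last bin collects
    every pair farther apart than (n_bins - 1) * bin_width.
    """
    fingerprint = [0] * n_bins
    for a, b in combinations(coords, 2):
        index = min(int(dist(a, b) / bin_width), n_bins - 1)
        fingerprint[index] += 1
    return fingerprint

def city_block_distance(fp_a, fp_b):
    """Sum of absolute differences between two count vectors."""
    return sum(abs(x - y) for x, y in zip(fp_a, fp_b))

# Toy four-atom chain with 1.5 angstrom bonds: extended vs. folded.
extended = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0), (4.5, 0.0, 0.0)]
folded = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0), (0.0, 1.5, 0.0)]

fp_extended = atom_pair_3d_fingerprint(extended)
fp_folded = atom_pair_3d_fingerprint(folded)

print(fp_extended[:5])  # [0, 3, 0, 2, 1]
print(fp_folded[:5])    # [0, 4, 2, 0, 0]
print(city_block_distance(fp_extended, fp_folded))  # 6
```

Both toy chains have the same connectivity, so a fingerprint based on topological distances would not separate them, whereas the through-space counts differ. A category-based variant in the spirit of 3DXfp would keep one such count vector per atom category and concatenate them.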
