Shape‐based similarity searching in chemical databases

Shape similarity is a key concept and requirement for molecular recognition. As a result, much research has been undertaken to develop methods to represent molecular shape and to quantify the shape similarity between molecules. A great variety of shape descriptions and similarity comparison approaches have been developed, ranging from explicit representations using intersecting atom‐centered spheres and molecular superposition to abstract statistical representations that allow alignment‐free comparisons. Several of these methods have sufficient computational performance to allow shape similarity searches over extremely large compound databases as a ligand‐based virtual screening (VS) technique. As with other approaches to VS, the relative performance of shape similarity methods is dataset and problem specific, with each approach having its merits and limitations and no one approach showing a clear and consistent advantage over the others. Again, as with other VS approaches, most reports of performance are in the context of retrospective validation studies, which show competitive performance against target‐based and two‐dimensional methods. Prospective studies are rarer in the literature, but a number of successes have been reported. Intensive research continues in the search for improved representations of shape, partial shape matching, and approaches to address the challenges imposed by ligand and target flexibility. © 2012 John Wiley & Sons, Ltd.
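To make the idea of an alignment-free, statistical shape comparison concrete, the following is a minimal sketch in the style of moment-based descriptors such as ultrafast shape recognition. It is not taken from the abstract, which names no specific algorithm. The function names, the choice of four reference points, the three moments per distance distribution, and the inverse scaled Manhattan distance score are all assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def _moments(dists: Sequence[float]) -> List[float]:
    """Return mean, standard deviation, and signed cube root of the
    third central moment of a list of distances."""
    n = len(dists)
    mean = sum(dists) / n
    m2 = sum((d - mean) ** 2 for d in dists) / n
    m3 = sum((d - mean) ** 3 for d in dists) / n
    return [mean, math.sqrt(m2), math.copysign(abs(m3) ** (1.0 / 3.0), m3)]


def shape_descriptor(coords: Sequence[Point]) -> List[float]:
    """Build a 12-number descriptor from atom coordinates (hypothetical helper).

    Assumed reference points: the centroid, the atom closest to it,
    the atom farthest from it, and the atom farthest from that atom.
    Only interatomic distances are used, so the result does not depend
    on how the molecule is rotated or translated.
    """
    n = len(coords)
    centroid = tuple(sum(c[i] for c in coords) / n for i in range(3))
    closest = min(coords, key=lambda c: math.dist(c, centroid))
    farthest = max(coords, key=lambda c: math.dist(c, centroid))
    far_from_far = max(coords, key=lambda c: math.dist(c, farthest))

    descriptor: List[float] = []
    for ref in (centroid, closest, farthest, far_from_far):
        descriptor.extend(_moments([math.dist(c, ref) for c in coords]))
    return descriptor


def shape_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Score in (0, 1]: inverse of one plus the mean absolute difference."""
    diff = sum(abs(x - y) for x, y in zip(a, b)) / len(a)
    return 1.0 / (1.0 + diff)
```

Because the descriptor is a short fixed-length vector, comparing a query against a database entry costs only a handful of arithmetic operations and needs no superposition step, which is what makes this family of methods attractive for very large compound collections. The sketch ignores conformational flexibility, chirality, and atom types, all of which a practical screening workflow would have to address.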
