Indirect Similarity Based Methods for Effective Scaffold-Hopping in Chemical Compounds

Methods that can screen large databases to retrieve a structurally diverse set of compounds with desirable bioactivity properties are critical to the drug discovery and development process. This paper presents a set of such methods designed to find compounds that are structurally different from a given query compound while retaining its bioactivity properties (scaffold hops). These methods measure the similarity between the query and a compound indirectly, taking into account information beyond their structure-based similarity. The techniques presented capture these indirect similarities by analyzing the similarity network formed by the query and the database compounds. Experimental evaluation shows that most of these methods substantially outperform previously developed approaches in their ability to identify both structurally diverse active compounds and active compounds in general.
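To make the idea of an indirect, network-based similarity concrete, the following is a minimal sketch and not the paper's actual method. It assumes compounds are represented as sets of fingerprint feature ids, builds a k-nearest-neighbor graph using the Tanimoto coefficient, and scores a candidate against the query by the overlap of their neighbor lists. The function names, the parameter `k`, and the toy fingerprints are all hypothetical and chosen only for illustration.

```python
from typing import Dict, FrozenSet, List, Set


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Direct (structure-based) similarity between two feature sets."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def knn_graph(fps: Dict[str, FrozenSet[int]], k: int) -> Dict[str, Set[str]]:
    """Link each compound to at most k most similar compounds (similarity > 0)."""
    graph: Dict[str, Set[str]] = {}
    for name, fp in fps.items():
        scored = [(tanimoto(fp, fps[o]), o) for o in fps if o != name]
        scored = [(s, o) for s, o in scored if s > 0.0]
        # Highest similarity first; ties broken by name for determinism.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        graph[name] = {o for _, o in scored[:k]}
    return graph


def indirect_similarity(graph: Dict[str, Set[str]], u: str, v: str) -> float:
    """Overlap of the two neighbor lists (each list includes the compound itself)."""
    nu = graph[u] | {u}
    nv = graph[v] | {v}
    shared = len(nu & nv)
    return shared / (len(nu) + len(nv) - shared)


def rank_by_indirect(fps: Dict[str, FrozenSet[int]], query: str, k: int) -> List[str]:
    """Rank all non-query compounds by indirect similarity to the query."""
    graph = knn_graph(fps, k)
    candidates = [c for c in fps if c != query]
    return sorted(candidates, key=lambda c: (-indirect_similarity(graph, query, c), c))


# Toy data (hypothetical): "b" shares no features with "q" but is linked via "a".
toy_fps = {
    "q": frozenset({1, 2, 3, 4}),
    "a": frozenset({3, 4, 5, 6}),
    "b": frozenset({5, 6, 7, 8}),
    "c": frozenset({20, 21, 22, 23}),
}
toy_ranking = rank_by_indirect(toy_fps, "q", k=2)
```

In this toy setup, compound `b` has zero direct Tanimoto similarity to the query, yet it ranks above the unrelated compound `c` because it shares the neighbor `a` with the query. That is the kind of structurally different but network-connected candidate a scaffold-hopping search aims to surface.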
