Clustering and Rule-Based Classifications of Chemical Structures Evaluated in the Biological Activity Space

Methods for classifying data sets of molecules by chemical structure were evaluated for their biological relevance. The methods included rule-based, scaffold-oriented classifications and clustering based on molecular descriptors. Three data sets from uniformly determined in vitro biological profiling experiments were classified by chemical structure, and the results were compared in a Pareto analysis with two competing objectives to be minimized: the number of classes and the average spread of the classes in the profile space. No classification method was superior overall to all the others studied, but a general trend emerged. Rule-based, scaffold-oriented methods are the better choice when classes with homogeneous biological activity are required and a large number of classes can be tolerated. Clustering based on chemical fingerprints is superior when fewer, larger classes are required and some loss of homogeneity in biological activity can be accepted.
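The Pareto comparison described above can be illustrated with a minimal sketch. It assumes each classification result is summarized by a label, its number of classes, and its average spread in profile space, with both numbers to be minimized. The labels and values are invented for illustration and are not data from the study; the function names are hypothetical.

```python
from typing import List, Tuple

# (method label, number of classes, average spread in profile space)
Result = Tuple[str, int, float]


def dominates(a: Result, b: Result) -> bool:
    """True if a is no worse than b in both objectives and strictly better in one."""
    no_worse = a[1] <= b[1] and a[2] <= b[2]
    strictly_better = a[1] < b[1] or a[2] < b[2]
    return no_worse and strictly_better


def pareto_front(results: List[Result]) -> List[Result]:
    """Keep only the results that no other result dominates."""
    return [r for r in results if not any(dominates(o, r) for o in results)]


# Invented example values (assumption): not taken from the paper.
results = [
    ("scaffold_rules", 120, 0.20),        # many classes, tight spread
    ("fingerprint_clusters", 30, 0.45),   # few classes, wider spread
    ("coarse_clusters", 30, 0.60),        # dominated by fingerprint_clusters
    ("fine_clusters", 150, 0.25),         # dominated by scaffold_rules
]

front = pareto_front(results)
print([label for label, _, _ in front])
```

In this toy example two results remain on the front, one with many homogeneous classes and one with fewer, broader classes. This mirrors the kind of trade-off the abstract reports, where neither family of methods wins on both objectives at once.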
