Background Theory of Molecular Diversity

Recent developments in high-throughput screening (HTS) and combinatorial chemistry have set computational chemistry a challenge: maximising the chemical diversity of the compounds made and screened. This paper examines the theory behind molecular diversity analysis and discusses most of the common diversity indices, together with intermolecular similarity and dissimilarity measures. It reviews the extent to which the different approaches to diversity analysis have been validated and compared. It then presents the effects of designing diverse libraries by analysing product space and reagent space, and discusses the issues surrounding the comparison of libraries and databases in diversity space.
