Stochastic voyages into uncharted chemical space produce a representative library of all possible drug-like compounds.

The "small molecule universe" (SMU), the set of all synthetically feasible organic molecules of 500 Da molecular weight or less, is estimated to contain over 10^60 structures, making exhaustive searches for structures of interest impractical. Here, we describe the construction of a "representative universal library" spanning the SMU that samples the full extent of feasible small molecule chemistries. This library was generated using the newly developed Algorithm for Chemical Space Exploration with Stochastic Search (ACSESS). ACSESS makes two important contributions to chemical space exploration: it allows the systematic search of the unexplored regions of the small molecule universe, and it facilitates the mining of chemical libraries that do not yet exist, providing a near-infinite source of diverse novel compounds.
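The idea of a stochastic search that keeps a small but maximally spread-out library can be illustrated with a toy sketch. This is not the published ACSESS procedure, whose details the abstract does not give. It assumes each "molecule" is just a point in a hypothetical two-dimensional descriptor space bounded to the unit square, that a "mutation" is a small random displacement, and that diversity is measured by Euclidean distance. All function names and parameters are illustrative.

```python
import math
import random


def min_pairwise_distance(points):
    """Smallest distance between any two library members (a simple diversity score)."""
    return min(math.dist(p, q) for i, p in enumerate(points) for q in points[i + 1:])


def mutate(point, rng, step=0.1):
    """Toy stand-in for a chemical mutation: random displacement, clipped to [0, 1]."""
    return tuple(min(1.0, max(0.0, x + rng.gauss(0.0, step))) for x in point)


def select_diverse(pool, size):
    """Greedy maximin selection: repeatedly add the point farthest from those chosen."""
    chosen = [pool[0]]
    nearest = [math.dist(p, chosen[0]) for p in pool]  # distance to closest chosen point
    while len(chosen) < size:
        idx = max(range(len(pool)), key=nearest.__getitem__)
        chosen.append(pool[idx])
        nearest = [min(d, math.dist(p, pool[idx])) for d, p in zip(nearest, pool)]
    return chosen


def explore(seed, size=20, generations=30, children=3, rng_seed=0):
    """Alternate random mutation with diversity-based selection (assumed toy loop)."""
    rng = random.Random(rng_seed)
    library = [seed] * size  # start from copies of a single seed "compound"
    for _ in range(generations):
        offspring = [mutate(p, rng) for p in library for _ in range(children)]
        library = select_diverse(library + offspring, size)
    return library


library = explore((0.5, 0.5))
print(len(library), round(min_pairwise_distance(library), 3))
```

Starting from identical copies of one seed, the library spreads out over the allowed region while its size stays fixed, which is the sense in which a small library can be "representative" of a space far too large to enumerate. A real application would replace the points with molecular structures, the displacement with chemically valid edits, and the distance with a descriptor-based dissimilarity.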
