Molecular Scaffold Analysis of Natural Products Databases in the Public Domain

Natural products are important sources of bioactive compounds in drug discovery. In this work, we compiled five natural products databases available in the public domain and performed a comprehensive chemoinformatic analysis of their scaffold content and diversity, complemented by an overview of diversity based on molecular fingerprints. The natural products databases were compared with each other, with a set of molecules from in-house combinatorial libraries, and with a commercial general screening library. The publicly available natural products databases were found to differ in scaffold diversity. Contrary to the common assumption that larger libraries have greater scaffold diversity, the largest natural products collection analyzed in this work was not the most diverse. The general screening library showed, overall, the highest scaffold diversity. However, when only the most frequent scaffolds were considered, this general reference library was the least diverse. In general, natural products databases in the public domain showed low molecule overlap. In addition to benzene and acyclic compounds, flavones, coumarins, and flavanones were identified as the most frequent molecular scaffolds across the different natural products collections. The results of this work have direct implications for the computational and experimental screening of natural product databases in drug discovery.
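
The observation that a library can rank highest in overall scaffold diversity yet lowest among its most frequent scaffolds may be easier to follow with a small numerical illustration. The sketch below is not the procedure used in this work, which the abstract does not specify. It assumes two commonly used measures: the fraction of unique scaffolds per molecule, and a Shannon entropy over the top-n most frequent scaffolds scaled to the range 0 to 1. The scaffold labels, the toy libraries `lib_a` and `lib_b`, and the function names are all hypothetical, and the scaffold extraction step itself is omitted.

```python
from collections import Counter
from math import log2


def scaffold_fraction(scaffolds):
    """Number of unique scaffolds divided by the number of molecules."""
    return len(set(scaffolds)) / len(scaffolds)


def scaled_shannon_entropy(scaffolds, top_n):
    """Shannon entropy of the top_n most frequent scaffolds, scaled to [0, 1].

    A value near 1 means molecules are spread evenly over those scaffolds;
    a value near 0 means one scaffold dominates.
    """
    counts = [count for _, count in Counter(scaffolds).most_common(top_n)]
    if len(counts) < 2:
        return 0.0  # entropy is zero when only one scaffold is considered
    total = sum(counts)
    entropy = -sum((c / total) * log2(c / total) for c in counts)
    return entropy / log2(len(counts))


# Hypothetical toy libraries: one scaffold label per molecule.
# lib_a has more distinct scaffolds, but one scaffold dominates.
lib_a = ["benzene"] * 7 + ["scaffold_x", "scaffold_y", "scaffold_z"]
# lib_b has fewer distinct scaffolds, spread more evenly.
lib_b = ["flavone"] * 4 + ["coumarin"] * 4 + ["flavanone"] * 2

for name, library in (("lib_a", lib_a), ("lib_b", lib_b)):
    print(
        name,
        "scaffolds per molecule:", scaffold_fraction(library),
        "scaled entropy (top 2):", round(scaled_shannon_entropy(library, 2), 3),
    )
```

In this toy case `lib_a` has the larger fraction of unique scaffolds, while `lib_b` is more evenly distributed over its two most frequent scaffolds, so the two measures rank the libraries in opposite order.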
