Cheminformatics in Natural Product‐based Drug Discovery

This review surveys the scope and limitations of cheminformatics methods in natural product‐based drug discovery. Following an overview of data resources for chemical, biological and structural information on natural products, we discuss, among other aspects, in silico methods for (i) data curation and natural product dereplication, (ii) analysis, visualization, navigation and comparison of the chemical space, (iii) quantification of natural product‐likeness, (iv) prediction of the bioactivities (virtual screening, target prediction), ADME and safety profiles (toxicity) of natural products, (v) natural product‐inspired de novo design and (vi) prediction of natural products prone to cause interference with biological assays. The methods discussed include rule‐based, similarity‐based, shape‐based, pharmacophore‐based and network‐based approaches, as well as docking and machine learning methods.
