BioNavi-NP: Biosynthesis Navigator for Natural Products

Nature, a synthetic master, creates more than 300,000 natural products (NPs), and the vast chemical space they span makes them major constituents of FDA-approved drugs. To date, fewer than 30,000 validated NP compounds are involved in about 33,000 known enzyme-catalyzed reactions, and even fewer biosynthetic pathways are known as complete, cascade-connected sequences of enzymatic steps. Computer-aided bio-retrosynthesis prediction is therefore valuable. Here, we develop BioNavi-NP, a navigable and user-friendly toolkit that predicts biosynthetic pathways for NPs and NP-like compounds using a novel AND-OR tree-based planning algorithm, an enhanced molecular Transformer neural network, and a training set that combines general organic transformations with biosynthetic steps. Extensive evaluations show that BioNavi-NP generalizes well: it identifies the reported biosynthetic pathways for 90% of test compounds and recovers the verified building blocks for 73%, significantly outperforming conventional rule-based approaches. BioNavi-NP also shows an outstanding capacity for enumerating biologically plausible pathways. In this sense, BioNavi-NP is a leading-edge toolkit for redesigning complex biosynthetic pathways of natural products, with applications to total or semi-synthesis and to pathway elucidation or reconstruction.
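The abstract names AND-OR tree planning without describing it, so the following is only a generic, minimal sketch of the idea and not BioNavi-NP's actual algorithm. A molecule acts as an OR node (any one candidate reaction may produce it) and a reaction acts as an AND node (all of its precursors must be reachable). The table `TOY_RULES`, the set `BUILDING_BLOCKS`, and the function `solve` are hypothetical; in a real system the candidate precursors would presumably come from the single-step Transformer model rather than a fixed dictionary.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical toy data: each molecule maps to alternative precursor sets.
# A real planner would query a learned single-step model instead.
TOY_RULES: Dict[str, List[Tuple[str, ...]]] = {
    "target": [("A", "B"), ("C",)],
    "A": [("bb1",)],
    "B": [("bb2", "X")],  # "X" has no known route, so this branch fails
    "C": [("bb1", "bb2")],
}
BUILDING_BLOCKS: Set[str] = {"bb1", "bb2"}

# A route is a list of (product, precursors) steps.
Route = List[Tuple[str, Tuple[str, ...]]]


def solve(mol: str, depth: int = 5) -> Optional[Route]:
    """Return a route with the fewest steps from building blocks, or None."""
    if mol in BUILDING_BLOCKS:
        return []  # nothing left to make
    if depth == 0:
        return None  # depth budget exhausted
    best: Optional[Route] = None
    # OR node: any single candidate reaction is enough.
    for precursors in TOY_RULES.get(mol, []):
        steps: Route = [(mol, precursors)]
        # AND node: every precursor of the reaction must be solvable.
        for precursor in precursors:
            sub_route = solve(precursor, depth - 1)
            if sub_route is None:
                break
            steps.extend(sub_route)
        else:
            if best is None or len(steps) < len(best):
                best = steps
    return best


if __name__ == "__main__":
    print(solve("target"))
```

This exhaustive depth-limited recursion is chosen only for brevity; practical planners typically guide the search with model scores or cost estimates instead of enumerating every branch.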
