Combining mechanistic and machine learning models for predictive engineering and optimization of tryptophan metabolism

Through advanced mechanistic modeling and the generation of large high-quality datasets, machine learning is becoming an integral part of understanding and engineering living systems. Here we show that mechanistic and machine learning models can be combined to enable accurate genotype-to-phenotype predictions. We use a genome-scale model to pinpoint engineering targets, efficiently construct a library of metabolic pathway designs, and screen that library with a high-throughput biosensor to generate training data for diverse machine learning algorithms. From a single data-generation cycle, this enables successful forward engineering of complex aromatic amino acid metabolism in yeast, with the best machine learning-guided design recommendations improving tryptophan titer and productivity by up to 74% and 43%, respectively, compared to the best designs used for algorithm training. Thus, this study highlights the power of combining mechanistic and machine learning models to effectively direct metabolic engineering efforts.
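The train-then-recommend step described above can be illustrated with a minimal Python sketch. It is not the study's pipeline: the gene names, promoter labels, and readout values are hypothetical placeholders, and a simple additive-effects model stands in for the machine learning algorithms the abstract refers to. The sketch fits the model on a small set of screened designs and ranks the designs that were not screened.

```python
from collections import defaultdict
from itertools import product

# Hypothetical toy library: each design assigns one promoter to each gene.
# The readouts are invented numbers, not data from the study.
GENES = ["geneA", "geneB"]
PROMOTERS = ["weak", "medium", "strong"]
measured = {
    ("weak", "weak"): 1.0,
    ("weak", "strong"): 2.0,
    ("medium", "medium"): 2.5,
    ("strong", "weak"): 3.0,
    ("strong", "medium"): 3.5,
}


def fit_additive(data):
    """Estimate, for each (gene position, promoter) pair, the mean deviation
    of the readout from the overall mean (an assumed stand-in model)."""
    overall = sum(data.values()) / len(data)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for design, readout in data.items():
        for position, promoter in enumerate(design):
            sums[(position, promoter)] += readout - overall
            counts[(position, promoter)] += 1
    effects = {key: sums[key] / counts[key] for key in sums}
    return overall, effects


def predict(model, design):
    """Predict a readout as the overall mean plus the per-position effects."""
    overall, effects = model
    return overall + sum(
        effects.get((position, promoter), 0.0)
        for position, promoter in enumerate(design)
    )


def recommend(model, data, top_n=3):
    """Rank all designs that were not screened by their predicted readout."""
    candidates = [
        design
        for design in product(PROMOTERS, repeat=len(GENES))
        if design not in data
    ]
    return sorted(candidates, key=lambda d: predict(model, d), reverse=True)[:top_n]


model = fit_additive(measured)
recommendations = recommend(model, measured)
for design in recommendations:
    print(dict(zip(GENES, design)), round(predict(model, design), 2))
```

An additive model cannot capture interactions between genes, which is one reason a real workflow would compare several algorithm families on held-out designs before trusting any recommendation.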
