Rapid Prediction of Bacterial Heterotrophic Fluxomics Using Machine Learning and Constraint Programming

13C metabolic flux analysis (13C-MFA) is widely used to measure in vivo enzyme reaction rates (i.e., metabolic fluxes) in microorganisms. Mining the relationships between environmental and genetic factors and metabolic fluxes hidden in existing fluxomic data can yield predictive models that significantly accelerate flux quantification. In this paper, we present MFlux (http://mflux.org), a web-based platform that predicts bacterial central metabolism via machine learning, leveraging data from approximately 100 13C-MFA papers on heterotrophic bacterial metabolism. Three machine learning methods, namely Support Vector Machine (SVM), k-Nearest Neighbors (k-NN), and Decision Tree, were employed to study the complex relationship between influential factors and metabolic fluxes. We performed a grid search for the best parameter set for each algorithm and verified performance through 10-fold cross-validation. SVM yielded the highest accuracy of the three algorithms. Further, we employed quadratic programming to adjust the predicted flux profiles so that they satisfy stoichiometric constraints. Multiple case studies show that MFlux can reasonably predict fluxomes as a function of bacterial species, substrate type, growth rate, oxygen condition, and cultivation method. Because published studies concentrate on model organisms grown on particular carbon sources, the fluxome dataset is biased, which may limit the applicability of the machine learning models. This problem can be resolved as more 13C-MFA papers are published for non-model species.
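The flux-adjustment step can be illustrated with a minimal sketch. The abstract states only that quadratic programming is used to make predicted fluxes satisfy stoichiometric constraints; the formulation below is an assumption, not the MFlux implementation. It keeps only the equality constraints S v = 0 and minimizes the squared distance to the predicted fluxes, which has the closed-form solution v = v_pred - Sᵀ(S Sᵀ)⁻¹ S v_pred. The toy network, the flux values, and the function names `solve_linear` and `balance_fluxes` are hypothetical.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    # Augmented matrix [A | b], copied so the inputs are not modified.
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system: stoichiometric rows are dependent")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back-substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def balance_fluxes(S, v_pred):
    """Return the flux vector closest to v_pred (least squares) with S v = 0.

    Assumption: equality constraints only. Flux bounds or reversibility
    constraints would require a general QP solver instead of this projection.
    """
    m, n = len(S), len(v_pred)
    # Mass-balance residual of the raw prediction: r = S v_pred.
    r = [sum(S[i][j] * v_pred[j] for j in range(n)) for i in range(m)]
    # Gram matrix G = S S^T.
    G = [[sum(a * b for a, b in zip(S[i], S[k])) for k in range(m)]
         for i in range(m)]
    lam = solve_linear(G, r)  # Lagrange multipliers
    return [v_pred[j] - sum(S[i][j] * lam[i] for i in range(m))
            for j in range(n)]


# Hypothetical toy network: metabolite A is made by v1 and consumed by v2, v3;
# metabolite B is made by v2 and consumed by v4.
S = [[1, -1, -1, 0],
     [0, 1, 0, -1]]
v_pred = [100.0, 62.0, 35.0, 60.0]  # made-up predictions that violate balance

v_adj = balance_fluxes(S, v_pred)
print([round(v, 3) for v in v_adj])
```

In this sketch the adjusted fluxes stay close to the predictions while both mass balances close exactly. Projection is chosen here only because it needs no external solver; it cannot enforce non-negativity or upper bounds on fluxes.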
