The ETFL formulation allows multi-omics integration in thermodynamics-compliant metabolism and expression models

Systems biology has long been interested in models capturing both metabolism and expression in a cell. We propose here an implementation of the metabolism and expression model formalism (ME-models), which we call ETFL, for Expression and Thermodynamics Flux models. ETFL is a hierarchical model formulation, from metabolism to RNA synthesis, that allows simulating thermodynamics-compliant intracellular fluxes as well as enzyme and mRNA concentration levels. ETFL is formulated as a mixed-integer linear problem (MILP) that allows the integration of both relative and absolute metabolite, protein, and mRNA concentrations. ETFL is compatible with standard MILP solvers and does not require a non-linear solver, unlike the previous state of the art. It also accounts for growth-dependent parameters, such as relative protein or mRNA content. We present ETFL along with its validation using results obtained from a well-characterized E. coli model. We show that ETFL is able to reproduce proteome-limited growth. We also subject it to several analyses, including the prediction of feasible mRNA and enzyme concentrations and gene essentiality.

Accounting for the effects of genetic expression in genome-scale metabolic models is challenging. Here, the authors introduce a model formulation that efficiently simulates thermodynamics-compliant fluxes and enzyme and mRNA concentration levels, allowing omics integration and broad analysis of in silico cellular physiology.
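The abstract does not spell out how a MILP can stand in for a non-linear solver. As a minimal sketch of the general idea, and not necessarily the construction used in ETFL, the product z = b·x of a binary variable b and a bounded continuous variable 0 ≤ x ≤ U can be replaced by four linear inequalities: z ≤ U·b, z ≤ x, z ≥ x − U·(1 − b), and z ≥ 0. The function name `product_bounds` and the numeric values below are hypothetical and chosen only for illustration.

```python
def product_bounds(b, x, upper):
    """Interval allowed for z by the four linear constraints
    z <= upper*b, z <= x, z >= x - upper*(1 - b), z >= 0.

    b is a binary value (0 or 1), x a continuous value in [0, upper].
    This is an illustrative helper, not part of any ETFL code.
    """
    lo = max(0.0, x - upper * (1 - b))  # the two lower-bounding constraints
    hi = min(x, upper * b)              # the two upper-bounding constraints
    return lo, hi


def check(upper=10.0):
    """Verify that the constraints pin z to exactly b*x on a small grid."""
    for b in (0, 1):
        for x in (0.0, 2.5, upper):
            lo, hi = product_bounds(b, x, upper)
            assert lo == hi == b * x
    return True


if __name__ == "__main__":
    print(check())
```

Because the feasible interval for z collapses to the single value b·x for both values of b, the bilinear term can be dropped from the problem without changing its solutions, at the cost of one extra continuous variable and four constraints per product.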
