Fast uncertainty quantification for dynamic flux balance analysis using non-smooth polynomial chaos expansions

We present a novel surrogate modeling method that can be used to accelerate the solution of uncertainty quantification (UQ) problems arising in nonlinear and non-smooth models of biological systems. In particular, we focus on dynamic flux balance analysis (DFBA) models that couple intracellular fluxes, found from the solution of a constrained metabolic network model of the cellular metabolism, to the time-varying extracellular substrate and product concentrations. DFBA models are generally computationally expensive and present unique challenges to UQ, as they entail dynamic simulations with discrete events that correspond to switches in the active set of the solution of the constrained intracellular model. The proposed non-smooth polynomial chaos expansion (nsPCE) method is an extension of traditional PCE that can effectively capture singularities in the DFBA model response due to the occurrence of these discrete events. The key idea in nsPCE is to use a model of the singularity time to partition the parameter space into two elements on which the model response behaves smoothly. A separate PCE model is then fit on each element using a basis-adaptive sparse regression approach that is known to scale well with respect to the number of uncertain parameters. We demonstrate the effectiveness of nsPCE on a DFBA model of an E. coli monoculture that consists of 1075 reactions and 761 metabolites. We first show that traditional PCE is unable to handle problems of this level of complexity. We then demonstrate that over 800-fold savings in the computational cost of uncertainty propagation and Bayesian estimation of parameters in the substrate uptake kinetics can be achieved by using the nsPCE surrogates in place of the full DFBA model simulations. Finally, we investigate the scalability of the nsPCE method by utilizing it for global sensitivity analysis and maximum a posteriori estimation in a synthetic metabolic network problem with a larger number of parameters related to both intracellular and extracellular quantities.
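The partitioning idea can be illustrated with a minimal sketch on a toy problem. Everything below is an assumption made for illustration and is not taken from the DFBA case study: the one-dimensional response `response`, the analytic `singularity_time`, and the helper names are hypothetical, and ordinary least squares on a Legendre basis stands in for the basis-adaptive sparse regression used in the paper. The sketch fits a smooth surrogate of the singularity time, uses it to split the samples into two elements at a fixed time `t`, and fits one expansion per element.

```python
import numpy as np
from numpy.polynomial import legendre

def singularity_time(xi):
    # Hypothetical toy model: time at which the discrete event occurs.
    return 2.0 + xi

def response(xi, t):
    # Hypothetical toy response: smooth on either side of the event,
    # with a kink where singularity_time(xi) == t.
    return np.exp(0.3 * xi) * np.minimum(t, singularity_time(xi))

def fit_legendre(xi, y, degree):
    # Ordinary least-squares fit of Legendre coefficients
    # (a stand-in for basis-adaptive sparse regression).
    V = legendre.legvander(xi, degree)
    coef, *_ = np.linalg.lstsq(V, y, rcond=None)
    return coef

def fit_nspce(xi, t, degree):
    # 1) Surrogate of the singularity time (smooth in xi).
    ts_coef = fit_legendre(xi, singularity_time(xi), degree)
    # 2) Partition samples: has the event occurred by time t or not?
    y = response(xi, t)
    before = legendre.legval(xi, ts_coef) > t
    # 3) One expansion per element.
    coef_before = fit_legendre(xi[before], y[before], degree)
    coef_after = fit_legendre(xi[~before], y[~before], degree)
    return ts_coef, coef_before, coef_after

def eval_nspce(model, xi, t):
    ts_coef, coef_before, coef_after = model
    before = legendre.legval(xi, ts_coef) > t
    return np.where(before,
                    legendre.legval(xi, coef_before),
                    legendre.legval(xi, coef_after))

rng = np.random.default_rng(0)
xi_train = rng.uniform(-1.0, 1.0, 200)   # uniform uncertain parameter
xi_test = np.linspace(-1.0, 1.0, 1001)
t, degree = 2.0, 4

# Single global expansion versus the two-element expansion.
global_coef = fit_legendre(xi_train, response(xi_train, t), degree)
model = fit_nspce(xi_train, t, degree)

y_test = response(xi_test, t)
err_global = np.max(np.abs(legendre.legval(xi_test, global_coef) - y_test))
err_nspce = np.max(np.abs(eval_nspce(model, xi_test, t) - y_test))
print(f"max error, global PCE: {err_global:.2e}")
print(f"max error, two-element PCE: {err_nspce:.2e}")
```

In this toy setting the global expansion has to approximate a kink with a single smooth polynomial, whereas each element of the partitioned surrogate only sees a smooth function, so the same polynomial degree should give a much smaller error. The actual method differs in the regression scheme and in how the singularity time is obtained from DFBA simulations.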
