Scalable nonlinear programming framework for parameter estimation in dynamic biological system models

We present a nonlinear programming (NLP) framework for the scalable solution of parameter estimation problems that arise in dynamic modeling of biological systems. Such problems are computationally challenging because they often involve highly nonlinear and stiff differential equations together with many experimental data sets and parameters. The proposed framework builds on state-of-the-art modeling and solution tools that are computationally efficient, robust, and easy to use. Specifically, it uses a time discretization approach that: i) avoids repetitive simulations of the dynamic model, ii) enables fully algebraic model implementations and computation of derivatives, and iii) enables the use of computationally efficient nonlinear interior-point solvers that exploit sparse and structured linear algebra techniques. We demonstrate these capabilities by solving estimation problems for synthetic human gut microbiome community models. An instance with 156 parameters, 144 differential equations, and 1,704 experimental data points is solved in less than 3 minutes with the proposed framework, whereas an off-the-shelf simulation-based solution framework requires over 7 hours. We also create larger instances to show that the framework scales: problems with up to 2,352 parameters, 2,304 differential equations, and 20,352 data points are solved in less than 15 minutes. The framework is flexible, can be applied broadly to dynamic models of biological systems, and enables the implementation of sophisticated estimation techniques to quantify parameter uncertainty, diagnose observability/uniqueness issues, perform model selection, and handle outliers.
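
To illustrate what a fully algebraic, discretized formulation can look like, the following minimal sketch applies implicit Euler to a hypothetical single-species model dx/dt = x(r + a x). It is not the authors' implementation: the model, the discretization scheme, the function names, and the numerical values are all assumptions chosen for illustration. In a simultaneous approach of this kind, the states at every grid point and the parameters (r, a) would be decision variables of one NLP, the residuals below would be its equality constraints, and the misfit would be its objective, so no repeated simulation of the model is needed.

```python
from math import sqrt

def euler_residuals(x, r, a, h):
    """Implicit Euler residuals for dx/dt = x*(r + a*x) on a uniform grid.
    An NLP solver would drive every entry to zero as an equality constraint."""
    return [x[k + 1] - x[k] - h * x[k + 1] * (r + a * x[k + 1])
            for k in range(len(x) - 1)]

def misfit(x, data):
    """Least-squares objective; data maps grid index -> measured value."""
    return sum((x[k] - y) ** 2 for k, y in data.items())

def consistent_states(x0, r, a, h, n):
    """Generate states that satisfy the residuals exactly by solving the
    quadratic of each implicit step (assumes a < 0 and x0 > 0)."""
    x = [x0]
    for _ in range(n):
        b = 1.0 - h * r
        x.append(2.0 * x[-1] / (b + sqrt(b * b - 4.0 * h * a * x[-1])))
    return x

# Hypothetical values: growth rate r = 1.0, self-interaction a = -0.5.
h, n = 0.1, 50
x_true = consistent_states(0.1, 1.0, -0.5, h, n)
data = {k: x_true[k] for k in range(0, n + 1, 10)}  # noise-free "measurements"

# Constraints hold at the generating parameters and are violated otherwise.
max_res_true = max(abs(c) for c in euler_residuals(x_true, 1.0, -0.5, h))
max_res_wrong = max(abs(c) for c in euler_residuals(x_true, 0.8, -0.5, h))
print(max_res_true, max_res_wrong, misfit(x_true, data))
```

Each residual involves only two neighboring states and the parameters, so the constraint Jacobian is sparse and banded. This is the kind of structure that sparse interior-point solvers are designed to exploit.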
