Parameter identifiability analysis and visualization in large-scale kinetic models of biosystems

Background: Kinetic models of biochemical systems usually consist of ordinary differential equations that have many unknown parameters. Some of these parameters are often practically unidentifiable, that is, their values cannot be uniquely determined from the available data. Possible causes are lack of influence on the measured outputs, interdependence among parameters, and poor data quality. Uncorrelated parameters can be seen as the key tuning knobs of a predictive model. Therefore, before attempting to perform parameter estimation (model calibration) it is important to characterize the subset(s) of identifiable parameters and their interplay. Once this is achieved, it is still necessary to perform parameter estimation, which poses additional challenges.

Methods: We present a methodology that (i) detects high-order relationships among parameters, and (ii) visualizes the results to facilitate further analysis. We use a collinearity index to quantify the correlation between parameters in a group in a computationally efficient way. Then we apply integer optimization to find the largest groups of uncorrelated parameters. We also use the collinearity index to identify small groups of highly correlated parameters. The results files can be visualized using Cytoscape, showing the identifiable and non-identifiable groups of parameters together with the model structure in the same graph.

Results: Our contributions alleviate the difficulties that appear at different stages of the identifiability analysis and parameter estimation process. We show how to combine global optimization and regularization techniques for calibrating medium and large scale biological models with moderate computation times. Then we evaluate the practical identifiability of the estimated parameters using the proposed methodology. The identifiability analysis techniques are implemented as a MATLAB toolbox called VisId, which is freely available as open source from GitHub (https://github.com/gabora/visid).

Conclusions: Our approach is geared towards scalability. It enables the practical identifiability analysis of dynamic models of large size, and accelerates their calibration. The visualization tool allows modellers to detect parts that are problematic and need refinement or reformulation, and provides experimentalists with information that can be helpful in the design of new experiments.
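
The collinearity index mentioned in the Methods can be illustrated with a small sketch. It assumes the commonly used definition: for a group K of parameters, the index is 1/sqrt(lambda_min), where lambda_min is the smallest eigenvalue of the Gram matrix of the unit-normalized sensitivity columns belonging to K. The sketch is written in Python with NumPy and does not reproduce the VisId MATLAB code. The function names, the toy sensitivity matrix, and the threshold of 20 are illustrative assumptions. The exhaustive search stands in for the integer optimization described above and is only practical for a handful of parameters.

```python
import itertools
import numpy as np

def collinearity_index(S, group):
    """Collinearity index of the sensitivity columns listed in `group`.

    S is an (observations x parameters) sensitivity matrix; columns are
    assumed to be non-zero. Large values indicate near-linear dependence.
    """
    cols = S[:, list(group)]
    cols = cols / np.linalg.norm(cols, axis=0)        # unit-length columns
    smallest = np.linalg.eigvalsh(cols.T @ cols)[0]   # eigenvalues ascend
    if smallest <= 1e-12:                             # numerically singular
        return np.inf
    return 1.0 / np.sqrt(smallest)

def largest_uncorrelated_group(S, threshold=20.0):
    """Brute-force search for the largest group below the threshold.

    Hypothetical stand-in for the integer optimization step; the cost
    grows exponentially with the number of parameters.
    """
    n = S.shape[1]
    for size in range(n, 0, -1):
        for group in itertools.combinations(range(n), size):
            if collinearity_index(S, group) < threshold:
                return group
    return ()

# Toy sensitivities (assumed data): the third parameter has exactly the
# same effect on the outputs as the first, up to a scale factor.
t = np.linspace(0.0, 1.0, 50)
p0 = np.sin(2.0 * np.pi * t)
p1 = np.cos(2.0 * np.pi * t)
S = np.column_stack([p0, p1, 2.0 * p0])

print(collinearity_index(S, (0, 1)))   # close to 1: nearly orthogonal
print(collinearity_index(S, (0, 2)))   # very large: perfectly correlated
print(largest_uncorrelated_group(S))
```

In this toy case the pair formed by the first and third parameters cannot be estimated independently, so the largest group below the threshold contains only two of the three parameters.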
