How informative is your kinetic model?: using resampling methods for model invalidation

Background: Kinetic models can present mechanistic descriptions of molecular processes within a cell. They can be used to predict the dynamics of metabolite production, signal transduction or transcription of genes. Although tremendous effort has gone into constructing kinetic models for different biological systems, far less has been put into their validation. In this study, we introduce the concept of resampling methods for the analysis of kinetic models and present a statistical model invalidation approach.

Results: We based our invalidation approach on the evaluation of a kinetic model's predictive power through cross validation and forecast analysis. As a reference point for this evaluation, we used the predictive power, on the same test sets, of an unsupervised data analysis method that does not make use of any biochemical knowledge, namely Smooth Principal Components Analysis (SPCA). Through a simulation study, we showed that overly simple mechanistic descriptions can be invalidated with our SPCA-based comparative approach unless the experimental data contain a high amount of noise. We also applied our approach to an eicosanoid production model developed for humans and concluded that the model could not be invalidated using the available data, despite the simplicity of its formulation of the reaction kinetics. Furthermore, we analysed the high osmolarity glycerol (HOG) pathway in yeast to question the validity of an existing model, as another realistic demonstration of our method.

Conclusions: With this study, we have successfully presented the potential of two resampling methods, cross validation and forecast analysis, in the analysis of kinetic models' validity. Our approach is easy to grasp and to implement, is applicable to any ordinary differential equation (ODE) type biological model, and does not suffer from the computational difficulties that seem to be a common problem for approaches proposed for similar purposes.
Matlab files needed for invalidation using SPCA cross validation and our toy model in SBML format are provided at http://www.bdagroup.nl/content/Downloads/software/software.php.
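
The comparison described in the Results can be illustrated with a minimal Python sketch. It is not the authors' Matlab implementation. It makes several assumptions: the kinetic model is a hypothetical first-order decay with a single rate constant, the data matrix has time points as rows and experiments with known initial concentrations as columns, and plain rank-1 PCA fitted by iterative imputation stands in for SPCA (no smoothness penalty is applied). All function names are invented for illustration.

```python
import numpy as np

def simulate(k, x0, t):
    """Toy kinetic model (assumption): x(t) = x0 * exp(-k t), one column per x0."""
    return np.outer(np.exp(-k * t), x0)  # shape (len(t), len(x0))

def fit_k(data, mask, x0, t, grid=np.linspace(0.01, 3.0, 300)):
    """Grid-search the rate constant using only entries where mask is True."""
    sse = [np.sum((simulate(k, x0, t) - data)[mask] ** 2) for k in grid]
    return grid[int(np.argmin(sse))]

def pca_impute(data, mask, rank=1, n_iter=200):
    """Plain PCA (stand-in for SPCA) fitted to training entries by iterative imputation."""
    filled = np.where(mask, data, data[mask].mean())
    for _ in range(n_iter):
        mean = filled.mean(axis=0)
        u, s, vt = np.linalg.svd(filled - mean, full_matrices=False)
        approx = (u[:, :rank] * s[:rank]) @ vt[:rank] + mean
        # Keep observed entries, overwrite held-out ones with the PCA estimate.
        filled = np.where(mask, data, approx)
    return approx

def cv_press(data, x0, t, n_folds=5, seed=0):
    """Element-wise cross validation: summed squared prediction error on held-out entries."""
    rng = np.random.default_rng(seed)
    folds = (rng.permutation(data.size) % n_folds).reshape(data.shape)
    press_model = press_pca = 0.0
    for f in range(n_folds):
        train = folds != f
        k_hat = fit_k(data, train, x0, t)        # kinetic model sees training entries only
        pred_model = simulate(k_hat, x0, t)
        pred_pca = pca_impute(data, train)       # reference uses no biochemical knowledge
        press_model += np.sum((pred_model - data)[~train] ** 2)
        press_pca += np.sum((pred_pca - data)[~train] ** 2)
    return press_model, press_pca

# Synthetic example data (assumed values, not from the paper).
t = np.linspace(0.0, 5.0, 12)
x0 = np.array([1.0, 2.0, 4.0, 8.0])
noise = np.random.default_rng(1).normal(scale=0.05, size=(t.size, x0.size))
data = simulate(0.5, x0, t) + noise

press_model, press_pca = cv_press(data, x0, t)
print(f"kinetic model PRESS: {press_model:.4f}, PCA reference PRESS: {press_pca:.4f}")
```

Under the logic of the abstract, a kinetic model whose held-out prediction error is clearly worse than that of the data-driven reference would be a candidate for invalidation. Forecast analysis could be sketched the same way by holding out the last time points instead of random entries.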
