Systems biology: model based evaluation and comparison of potential explanations for given biological data

Systems biology and its use of mathematical modeling to analyse biological data is rapidly becoming an established approach to biology. A crucial advantage of this approach is that more information can be extracted from observations of intricate dynamics, which allows nontrivial, complex explanations to be evaluated and compared. In this minireview we explain this process and review some of the most central analysis tools available. The focus is on the evaluation and comparison of given explanations for a given set of experimental data and prior knowledge. Three types of methods are discussed: (a) for evaluating whether a given model describes the given data well enough that it cannot be rejected; (b) for evaluating whether a slightly superior model is significantly better; and (c) for a general evaluation and comparison of the biologically interesting features in a model. The most central methods are reviewed both in terms of their underlying assumptions, with references to more advanced literature for the theoretically oriented reader, and in terms of practical guidelines and examples for the practically oriented reader. Many of the methods are based upon analysis tools from statistics and engineering, and we emphasize that the systems biology focus on acceptable explanations puts these methods in a nonstandard setting. We highlight some associated improvements that will be essential for future developments of model-based data analysis in biology.
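As a rough illustration of method types (a) and (b), the sketch below shows one common way such tests can be set up; it is not taken from the review itself. It assumes independent Gaussian measurement noise with known standard deviations, a chi-square test on the weighted residual sum of squares for (a), and a likelihood ratio test between nested models for (b). The function names (`chi2_sf`, `weighted_rss`, `is_rejected`, `lr_test`), the choice of degrees of freedom, and the 0.05 significance level are all illustrative assumptions. As the abstract notes, the nonstandard setting means the textbook chi-square distributions used here may not hold exactly in practice.

```python
import math


def chi2_sf(x, dof):
    """Upper-tail probability P(X > x) for a chi-square variable.

    Closed-form expressions valid for integer dof >= 1 only.
    """
    if dof < 1 or int(dof) != dof:
        raise ValueError("dof must be a positive integer")
    if x <= 0:
        return 1.0
    y = x / 2.0
    if dof % 2 == 0:
        # Even dof: exp(-y) * sum_{i=0}^{dof/2 - 1} y^i / i!
        total = sum(y ** i / math.factorial(i) for i in range(dof // 2))
        return math.exp(-y) * total
    # Odd dof: erfc(sqrt(y)) + exp(-y) * sum_{i=1}^{(dof-1)/2} y^(i-1/2) / Gamma(i+1/2)
    total = sum(
        y ** (i - 0.5) / math.gamma(i + 0.5)
        for i in range(1, (dof - 1) // 2 + 1)
    )
    return math.erfc(math.sqrt(y)) + math.exp(-y) * total


def weighted_rss(data, prediction, sigma):
    """Sum of squared residuals, each normalised by its noise std (assumed known)."""
    return sum(((d - p) / s) ** 2 for d, p, s in zip(data, prediction, sigma))


def is_rejected(cost, dof, alpha=0.05):
    """Type (a): reject the model if the cost is improbably large under chi-square(dof).

    The appropriate dof (e.g. number of data points, possibly reduced by the
    number of identifiable parameters) is an assumption the user must justify.
    """
    return chi2_sf(cost, dof) < alpha


def lr_test(cost_small, cost_large, extra_params, alpha=0.05):
    """Type (b): likelihood ratio test for NESTED models.

    With Gaussian noise and known sigma, the difference in weighted costs equals
    -2 * log(likelihood ratio). Returns (statistic, p_value, significant).
    """
    stat = cost_small - cost_large
    p_value = chi2_sf(stat, extra_params)
    return stat, p_value, p_value < alpha


# Hypothetical data and model predictions, for illustration only.
data = [1.0, 2.1, 2.9, 4.2]
prediction = [1.0, 2.0, 3.0, 4.0]
sigma = [0.1, 0.1, 0.1, 0.1]

cost = weighted_rss(data, prediction, sigma)
rejected = is_rejected(cost, dof=len(data))
stat, p_value, significant = lr_test(cost_small=12.0, cost_large=cost, extra_params=1)
```

For non-nested models, or when parameters lie on a boundary or are unidentifiable, the chi-square reference distribution in `lr_test` is not guaranteed; simulation-based alternatives such as the bootstrap are one option in those cases.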
