Comprehensive benchmarking of Markov chain Monte Carlo methods for dynamical systems

Background: In quantitative biology, mathematical models are used to describe and analyze biological processes. The parameters of these models are usually unknown and need to be estimated from experimental data using statistical methods. In particular, Markov chain Monte Carlo (MCMC) methods have become increasingly popular because they allow for a rigorous analysis of parameter and prediction uncertainties without assuming parameter identifiability or removing non-identifiable parameters. A broad spectrum of MCMC algorithms has been proposed, including single- and multi-chain approaches. However, selecting and tuning sampling algorithms suited to a given problem remains challenging, and a comprehensive comparison of different methods has so far not been available.

Results: We present the results of a thorough benchmarking of state-of-the-art single- and multi-chain sampling methods, including Adaptive Metropolis, Delayed Rejection Adaptive Metropolis, the Metropolis-adjusted Langevin algorithm, Parallel Tempering and Parallel Hierarchical Sampling. Different initialization and adaptation schemes are considered. To ensure a comprehensive and fair comparison, we consider problems with a range of features such as bifurcations, periodic orbits, multistability of steady-state solutions and chaotic regimes. These problem properties give rise to various posterior distributions, including uni- and multi-modal distributions and distributions with non-normally distributed mode tails. For an objective comparison, we developed a pipeline for the semi-automatic comparison of sampling results.

Conclusion: The comparison of MCMC algorithms, initialization and adaptation schemes revealed that, overall, multi-chain algorithms perform better than single-chain algorithms. In some cases, this performance can be further increased by using a preceding multi-start local optimization scheme. These results can inform the selection of sampling methods, and the benchmark collection can serve for the evaluation of new algorithms. Furthermore, our results confirm the need to address the exploration quality of MCMC chains before applying the commonly used quality measure of effective sample size, to prevent false analysis conclusions.
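The caveat about effective sample size can be illustrated with a toy example that is not taken from the benchmark itself. The sketch below assumes a one-dimensional bimodal target (an equal-weight mixture of two unit-variance Gaussians), a plain random-walk Metropolis sampler, and a simple autocorrelation-based effective sample size estimate; all function names and settings are hypothetical choices made for illustration only.

```python
import math
import random


def log_post(x, sep=10.0):
    """Unnormalized log density of an equal-weight mixture of N(-sep, 1) and N(+sep, 1)."""
    a = -0.5 * (x + sep) ** 2
    b = -0.5 * (x - sep) ** 2
    m = max(a, b)  # log-sum-exp trick for numerical stability
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def metropolis(n, x0, step, rng):
    """Single-chain random-walk Metropolis with a Gaussian proposal of width `step`."""
    x, lp = x0, log_post(x0)
    chain = []
    for _ in range(n):
        y = x + rng.gauss(0.0, step)
        lq = log_post(y)
        # Accept with probability min(1, exp(lq - lp)).
        if rng.random() < math.exp(min(0.0, lq - lp)):
            x, lp = y, lq
        chain.append(x)
    return chain


def effective_sample_size(chain, max_lag=1000):
    """ESS = n / tau, summing autocorrelations until the first non-positive one."""
    n = len(chain)
    mean = sum(chain) / n
    c = [v - mean for v in chain]
    var = sum(v * v for v in c) / n
    tau = 1.0
    for lag in range(1, min(max_lag, n - 1)):
        rho = sum(c[i] * c[i + lag] for i in range(n - lag)) / (n * var)
        if rho <= 0.0:
            break
        tau += 2.0 * rho
    return n / tau


rng = random.Random(0)  # fixed seed so the run is reproducible
chain = metropolis(20000, x0=-10.0, step=2.0, rng=rng)  # start in the left mode
ess = effective_sample_size(chain)
right_mode_fraction = sum(v > 0 for v in chain) / len(chain)
print(f"ESS ~ {ess:.0f}; fraction of samples in the right mode: {right_mode_fraction:.2f}")
```

Half of the posterior mass of this target lies in the right mode, yet a chain started in the left mode with this proposal width is extremely unlikely to ever cross the low-density region between the modes. The within-mode autocorrelation still decays quickly, so the estimated effective sample size looks healthy even though the chain has explored only part of the posterior. This is the kind of false conclusion that a check of exploration quality is meant to catch before effective sample size is compared.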
