Simulation and inference algorithms for stochastic biochemical reaction networks: from basic concepts to state-of-the-art

Stochasticity is a key characteristic of intracellular processes such as gene regulation and chemical signalling. Therefore, characterizing stochastic effects in biochemical systems is essential to understand the complex dynamics of living things. Mathematical idealizations of biochemically reacting systems must be able to capture stochastic phenomena. While robust theory exists to describe such stochastic models, the computational challenges in exploring these models can be a significant burden in practice since realistic models are analytically intractable. Determining the expected behaviour and variability of a stochastic biochemical reaction network requires many probabilistic simulations of its evolution. Using a biochemical reaction network model to assist in the interpretation of time-course data from a biological experiment is an even greater challenge due to the intractability of the likelihood function for determining observation probabilities. These computational challenges have been subjects of active research for over four decades. In this review, we present an accessible discussion of the major historical developments and state-of-the-art computational techniques relevant to simulation and inference problems for stochastic biochemical reaction network models. Detailed algorithms for particularly important methods are described and complemented with Matlab® implementations. As a result, this review provides a practical and accessible introduction to computational methods for stochastic models within the life sciences community.
