Fast Approximate Bayesian Computation for discretely observed Markov models using a factorised posterior distribution

Many modern statistical applications involve inference for complicated stochastic models for which the likelihood function is difficult or even impossible to calculate, and hence conventional likelihood-based inferential techniques cannot be used. In such settings, Bayesian inference can be performed using Approximate Bayesian Computation (ABC). However, despite many recent developments in ABC methodology, in many applications the computational cost of ABC necessitates the choice of summary statistics and tolerances that can severely bias the estimate of the posterior. We propose a new "piecewise" ABC approach suitable for discretely observed Markov models. It involves writing the posterior density of the parameters as a product of factors, each a function of only a subset of the data, and then using ABC within each factor. The approach side-steps the need to choose a summary statistic and enables a stringent tolerance to be set, making the posterior "less approximate". We investigate two methods for estimating the posterior density based on ABC samples for each of the factors: the first uses a Gaussian approximation for each factor, and the second uses a kernel density estimate. Both methods have their merits. The Gaussian approximation is simple, fast, and probably adequate for many applications. The kernel density estimate, on the other hand, has the benefit of consistently estimating the true ABC posterior as the number of ABC samples tends to infinity. We illustrate the piecewise ABC approach with three examples; in each case, the approach enables "exact matching" between simulations and data and offers fast and accurate inference.
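To make the idea concrete, the following is a minimal sketch of piecewise ABC with the Gaussian approximation, under several assumptions that are not taken from the abstract. It assumes the factors are built from the Markov property, so that the posterior is proportional to π(θ)^(1-n) times the product over transitions of φ_i(θ), where φ_i(θ) is proportional to π(θ) π(x_i | x_{i-1}, θ). The model is a toy integer-valued autoregressive chain (binomial thinning with unknown probability `alpha`, plus Poisson arrivals with known rate `lam`), the prior on `alpha` is Uniform(0, 1), and the function names `simulate_step`, `abc_factor`, and `piecewise_abc_gaussian` are hypothetical. Because the state is integer-valued, each factor can be sampled by ABC with exact matching of a single transition.

```python
import numpy as np


def simulate_step(x_prev, alpha, lam, rng):
    """Simulate one transition of the toy chain for each value in `alpha`.

    x_next = Binomial(x_prev, alpha) + Poisson(lam); `alpha` must be a 1-D array.
    """
    return rng.binomial(x_prev, alpha) + rng.poisson(lam, size=alpha.shape)


def abc_factor(x_prev, x_next, lam, rng, n_keep=200, batch=5000):
    """Rejection ABC for a single transition, accepting only exact matches.

    Draws alpha from the Uniform(0, 1) prior in batches until at least
    `n_keep` draws reproduce the observed x_next, then returns n_keep of them.
    """
    accepted = []
    total = 0
    while total < n_keep:
        alpha = rng.uniform(0.0, 1.0, size=batch)
        sim = simulate_step(x_prev, alpha, lam, rng)
        hits = alpha[sim == x_next]
        accepted.append(hits)
        total += hits.size
    return np.concatenate(accepted)[:n_keep]


def piecewise_abc_gaussian(x, lam, seed=0):
    """Combine per-transition ABC samples via a Gaussian approximation.

    Each factor is approximated by N(mu_i, var_i). The product of Gaussian
    densities is Gaussian with precision sum(1 / var_i) and mean
    sum(mu_i / var_i) / precision. With a Uniform(0, 1) prior the term
    prior**(1 - n) is constant on the support, so it is dropped here; the
    truncation of alpha to (0, 1) is ignored by this approximation.
    """
    rng = np.random.default_rng(seed)
    precision = 0.0
    weighted_mean = 0.0
    for x_prev, x_next in zip(x[:-1], x[1:]):
        samples = abc_factor(x_prev, x_next, lam, rng)
        mu = samples.mean()
        var = samples.var(ddof=1)
        precision += 1.0 / var
        weighted_mean += mu / var
    return weighted_mean / precision, 1.0 / precision


# Usage: simulate a short series from the toy model, then run piecewise ABC.
data_rng = np.random.default_rng(1)
true_alpha, lam = 0.6, 2.0
x = [5]
for _ in range(30):
    x.append(int(simulate_step(x[-1], np.full(1, true_alpha), lam, data_rng)[0]))

post_mean, post_var = piecewise_abc_gaussian(x, lam)
print(f"approximate posterior for alpha: mean={post_mean:.3f}, sd={post_var ** 0.5:.3f}")
```

Each factor only requires simulating one transition, which is why exact matching stays affordable here even though matching the whole series at once would almost never succeed. Replacing the Gaussian step with a kernel density estimate of each factor would follow the second method mentioned above, at a higher computational cost.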
