Finding our Way in the Dark: Approximate MCMC for Approximate Bayesian Methods

With larger data at their disposal, scientists are emboldened to tackle complex questions that require sophisticated statistical models. It is not unusual for the latter to have likelihood functions that elude analytical formulations. Even under such adversity, when one can simulate from the sampling distribution, Bayesian analysis can be conducted using approximate methods such as Approximate Bayesian Computation (ABC) or Bayesian Synthetic Likelihood (BSL). A significant drawback of these methods is that the number of required simulations can be prohibitively large, thus severely limiting their scope. In this paper we design perturbed MCMC samplers that can be used within the ABC and BSL paradigms to significantly accelerate computation while maintaining control over computational efficiency. The proposed strategy relies on recycling samples from the chain's past. The algorithmic design is supported by a theoretical analysis, while practical performance is examined via a series of simulation examples and data analyses.
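To make the simulation burden concrete, the following is a minimal sketch of plain ABC rejection sampling. It is not the perturbed MCMC sampler proposed in the paper. The toy model (Normal data with unit variance, a Uniform(-5, 5) prior on the mean, the sample mean as summary statistic) and the names `abc_rejection`, `n_accept`, and `eps` are assumptions chosen only for illustration.

```python
import random
import statistics


def abc_rejection(observed, n_accept, eps, seed=0):
    """Plain ABC rejection sampling for a toy Normal(theta, 1) model.

    Hypothetical illustration: the prior, the summary statistic, and the
    tolerance rule are assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    s_obs = statistics.fmean(observed)  # summary statistic of the observed data
    n = len(observed)
    accepted = []
    n_sim = 0  # number of full pseudo-data sets simulated so far
    while len(accepted) < n_accept:
        theta = rng.uniform(-5.0, 5.0)  # draw a parameter from the prior
        simulated = [rng.gauss(theta, 1.0) for _ in range(n)]
        n_sim += 1
        # keep theta only if the simulated summary is close to the observed one
        if abs(statistics.fmean(simulated) - s_obs) < eps:
            accepted.append(theta)
    return accepted, n_sim


# Synthetic "observed" data generated with true mean 2.0 (an assumption).
data_rng = random.Random(1)
observed = [data_rng.gauss(2.0, 1.0) for _ in range(50)]

draws, n_sim = abc_rejection(observed, n_accept=200, eps=0.1)
print("approximate posterior mean:", statistics.fmean(draws))
print("simulated data sets needed:", n_sim)
```

In this sketch every proposed parameter value costs one fresh simulated data set, and most proposals are rejected, so `n_sim` is typically far larger than `n_accept`. That cost is what motivates reusing earlier simulations rather than discarding them.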
