UvA-DARE (Digital Academic Repository) Austerity in MCMC Land: Cutting the Metropolis-Hastings

Can we make Bayesian posterior MCMC sampling more efficient when faced with very large datasets? We argue that computing the likelihood for N datapoints in the Metropolis-Hastings (MH) test to reach a single binary decision is computationally inefficient. We introduce an approximate MH rule based on a sequential hypothesis test that allows us to accept or reject samples with high confidence using only a fraction of the data required for the exact MH rule. While this method introduces an asymptotic bias, we show that this bias can be controlled and is more than offset by a decrease in variance due to our ability to draw more samples per unit of time.
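As a rough illustration of the idea, the exact MH test can be written as a comparison between the mean per-datapoint log-likelihood difference and a threshold, and that comparison can be decided early from a subsample. The sketch below is a minimal version under stated assumptions, not the authors' implementation: the function name `sequential_mh_test`, the parameters `batch_size` and `epsilon`, the normal approximation to the test statistic, and the finite-population correction are all choices made here for illustration. The toy model (Gaussian mean with a flat prior and symmetric proposal, so the threshold reduces to log(u)/N) is likewise hypothetical.

```python
import math
import random


def normal_sf(z):
    """Upper-tail probability of a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def sequential_mh_test(log_lik_diff, n_data, mu0,
                       batch_size=100, epsilon=0.05, rng=random):
    """Decide whether mean_i log_lik_diff(i) > mu0 from a growing subsample.

    log_lik_diff(i) returns log p(x_i | proposal) - log p(x_i | current).
    Returns (accept, n_used). Illustrative sketch only.
    """
    if n_data < 1:
        raise ValueError("need at least one datapoint")
    order = list(range(n_data))
    rng.shuffle(order)  # visit datapoints without replacement
    n, total, total_sq = 0, 0.0, 0.0
    while True:
        for i in order[n:n + batch_size]:
            value = log_lik_diff(i)
            total += value
            total_sq += value * value
        n = min(n + batch_size, n_data)
        mean = total / n
        if n == n_data:
            return mean > mu0, n  # all data used: this is the exact test
        if n < 2:
            continue  # cannot estimate a variance yet
        var = max(total_sq - n * mean * mean, 0.0) / (n - 1)
        # Standard error with a finite-population correction (assumption).
        std_err = math.sqrt(var / n * (1.0 - (n - 1) / (n_data - 1)))
        if std_err > 0.0 and normal_sf(abs(mean - mu0) / std_err) < epsilon:
            return mean > mu0, n  # confident enough to stop early


# Toy usage: Gaussian likelihood with unit variance, flat prior,
# symmetric proposal, so mu0 = log(u) / N.
rng = random.Random(0)
N = 20000
data = [rng.gauss(1.0, 1.0) for _ in range(N)]
theta, theta_new = 0.0, 0.5


def diff(i):
    x = data[i]
    return -0.5 * (x - theta_new) ** 2 + 0.5 * (x - theta) ** 2


u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
accept, n_used = sequential_mh_test(diff, N, math.log(u) / N, rng=rng)
print(accept, n_used)
```

Smaller values of `epsilon` make early decisions rarer and the chain closer to exact MH, at the cost of touching more data per step; this is the bias versus speed trade-off the abstract describes.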
