An Efficient Minibatch Acceptance Test for Metropolis-Hastings

We present a novel Metropolis-Hastings method for large datasets that uses small expected-size minibatches of data. Previous work on reducing the cost of Metropolis-Hastings tests yields methods that consume a variable amount of data per sample, offering only constant-factor reductions over using the full dataset for each test. Here we present a method that can be tuned to provide arbitrarily small batch sizes by adjusting either the proposal step size or the temperature. Our test uses the noise-tolerant Barker acceptance test with a novel additive correction variable. The resulting test has a cost similar to that of a standard SGD update. Our experiments demonstrate speedups of several orders of magnitude over previous work.
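
A minimal sketch may help fix the idea. The exact Barker test accepts a move with probability 1/(1 + exp(-Δ)), where Δ is the log acceptance ratio; equivalently, it accepts iff Δ + L > 0 with L drawn from a standard logistic distribution. A minibatch estimate of Δ carries approximately Gaussian noise, and the method adds a correction variable chosen so that Gaussian noise plus correction is logistic. The sketch below is illustrative only, not the paper's implementation: the function names are ours, and where the paper constructs the exact correction distribution by deconvolution, we substitute a Gaussian with the complementary variance, which matches the logistic noise only approximately.

```python
import numpy as np

rng = np.random.default_rng(0)
LOGISTIC_VAR = np.pi ** 2 / 3  # variance of the standard logistic distribution


def barker_test(delta):
    """Exact Barker test: accept with probability 1 / (1 + exp(-delta)).

    Equivalent form: accept iff delta + L > 0, with L ~ Logistic(0, 1).
    """
    return delta + rng.logistic() > 0


def minibatch_barker_test(delta_hat, noise_var):
    """Barker test driven by a noisy minibatch estimate of delta.

    delta_hat is assumed to carry Gaussian noise of variance noise_var,
    with noise_var < pi^2 / 3. The paper samples the additive correction so
    that Gaussian noise + correction is exactly Logistic(0, 1); the Gaussian
    correction below is a crude stand-in that matches only the variance.
    """
    assert noise_var < LOGISTIC_VAR, "minibatch noise must be below logistic variance"
    correction = rng.normal(scale=np.sqrt(LOGISTIC_VAR - noise_var))
    return delta_hat + correction > 0


# Sanity check: with the (approximate) correction, the acceptance frequency
# of the noisy test stays close to the exact Barker probability.
delta, noise_var = 0.5, 1.0
noisy = np.mean([
    minibatch_barker_test(delta + rng.normal(scale=np.sqrt(noise_var)), noise_var)
    for _ in range(100_000)
])
print(f"noisy test: {noisy:.3f}  exact Barker: {1 / (1 + np.exp(-delta)):.3f}")
```

Roughly, the correction construction requires the estimator's noise variance to stay below the logistic variance; shrinking the proposal step size (or raising the temperature) shrinks the per-example variance of Δ, which is what lets the expected batch size be tuned down as the abstract claims.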
