An Adaptive Subsampling Approach for MCMC Inference in Large Datasets

Markov chain Monte Carlo (MCMC) methods are often deemed far too computationally intensive to be of any practical use for large datasets. This paper describes a methodology that aims to scale up the Metropolis-Hastings (MH) algorithm in this context. We propose an approximate implementation of the accept/reject step of MH that only requires evaluating the likelihood of a random subset of the data, yet is guaranteed to coincide with the accept/reject step based on the full dataset with a probability exceeding a user-specified tolerance level. This adaptive subsampling technique is an alternative to the recent approach developed in [19], and it allows us to establish rigorously that the resulting approximate MH algorithm samples from a perturbed version of the target distribution of interest, whose total variation distance to that target is explicitly controlled. We explore the benefits and limitations of this scheme on several examples.

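As a concrete illustration of the accept/reject step described above, the sketch below implements an adaptive subsampling test in Python. It is a minimal sketch under stated assumptions, not the paper's exact procedure: the function and argument names (approx_mh_accept, log_lik, log_prior, log_q, log_ratio_bound, batch, delta) are illustrative choices made here, the concentration inequality used is a plain Hoeffding bound (which also covers sampling without replacement [3, 6]), and the paper additionally considers empirical Bernstein-type bounds [22]. The idea is to compare a running subsample estimate of the average log-likelihood ratio with the threshold implied by the MH acceptance rule, stopping as soon as the confidence bound separates the two or the whole dataset has been used.

```python
import numpy as np


def approx_mh_accept(x, theta, theta_prime, log_lik, log_prior, log_q,
                     u, log_ratio_bound, delta=0.01, batch=100, rng=None):
    """Adaptive-subsampling approximation of the MH accept/reject step (sketch).

    Arguments (names are illustrative assumptions, not the paper's notation):
      x               -- array of n data points
      log_lik(x, t)   -- pointwise log-likelihoods log p(x_i | t), vectorised
      log_prior(t)    -- log prior density at t
      log_q(a, b)     -- log proposal density of a given b, i.e. log q(a | b)
      u               -- the Uniform(0, 1) draw of the MH step
      log_ratio_bound -- assumed bound on the range of
                         log p(x_i | theta') - log p(x_i | theta)
      delta           -- error budget for this accept/reject decision
      batch           -- number of additional data points examined per look
    """
    rng = np.random.default_rng() if rng is None else rng
    n = len(x)

    # Exact MH rule: accept iff (1/n) sum_i [log p(x_i|theta') - log p(x_i|theta)] > psi.
    psi = (np.log(u) + log_prior(theta) - log_prior(theta_prime)
           + log_q(theta_prime, theta) - log_q(theta, theta_prime)) / n

    perm = rng.permutation(n)       # subsample without replacement
    t, k, running_sum = 0, 0, 0.0
    while True:
        t_new = min(t + batch, n)
        idx = perm[t:t_new]
        running_sum += np.sum(log_lik(x[idx], theta_prime) - log_lik(x[idx], theta))
        t, k = t_new, k + 1
        lam_hat = running_sum / t   # subsample estimate of the average log-likelihood ratio

        if t == n:                  # whole dataset used: the decision is exact
            return lam_hat > psi

        # Hoeffding bound at level delta_k, with sum_k delta_k <= delta / 2.
        delta_k = delta / (2.0 * k * (k + 1))
        c = log_ratio_bound * np.sqrt(np.log(2.0 / delta_k) / (2.0 * t))
        if abs(lam_hat - psi) > c:  # confident decision from the subsample alone
            return lam_hat > psi
```

The decreasing per-look levels delta_k are a union-bound device: they sum to at most delta, so under the assumed bound on the log-likelihood ratios the subsampled decision agrees with the full-data decision with probability at least 1 - delta at each MH iteration; when the data is exhausted the decision is exact, so the approximation error comes only from early stops.
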
[1] J. Sanders et al., Integrating a modified simulated annealing algorithm with the simulation of a manufacturing system to optimize buffer sizes in automatic assembly systems, Winter Simulation Conference Proceedings, 1988.

[2] A. van der Vaart et al., Asymptotic Statistics: U-Statistics, 1998.

[3] W. Hoeffding, Probability Inequalities for Sums of Bounded Random Variables, 1963.

[4] Samy Bengio et al., A Parallel Mixture of SVMs for Very Large Scale Problems, Neural Computation, 2001.

[5] Csaba Szepesvári et al., Exploration-exploitation tradeoff using variance estimates in multi-armed bandits, Theoretical Computer Science, 2009.

[6] R. Serfling, Probability Inequalities for the Sum in Sampling without Replacement, 1974.

[7] James Ledoux et al., Regular Perturbation of V-Geometrically Ergodic Markov Chains, Journal of Applied Probability, 2012.

[8] Andrew McCallum et al., Monte Carlo MCMC: Efficient Inference by Approximate Sampling, EMNLP, 2012.

[9] Csaba Szepesvári et al., Efficient Stopping Rules, 2008.

[10] H. Haario et al., An adaptive Metropolis algorithm, 2001.

[11] Randal Douc et al., Nonlinear Time Series: Theory, Methods and Applications with R Examples, 2014.

[12] Nicolò Cesa-Bianchi et al., Combinatorial Bandits, COLT, 2012.

[13] Talal M. Alkhamis et al., Simulated annealing for discrete optimization with estimation, European Journal of Operational Research, 1999.

[14] Odalric-Ambrym Maillard et al., Concentration inequalities for sampling without replacement, 2013, arXiv:1309.4029.

[15] Osamu Watanabe et al., MadaBoost: A Modification of AdaBoost, COLT, 2000.

[16] Christophe Andrieu et al., A tutorial on adaptive MCMC, Statistics and Computing, 2008.

[17] C. Andrieu et al., The pseudo-marginal approach for efficient Monte Carlo computations, 2009, arXiv:0903.5480.

[18] M. G. Pittau et al., A weakly informative default prior distribution for logistic and other regression models, 2008, arXiv:0901.4011.

[19] Max Welling et al., Austerity in MCMC Land: Cutting the Metropolis-Hastings Budget, ICML, 2014.

[20] Liang Zhang et al., Stochastic optimization using simulated annealing with hypothesis test, Applied Mathematics and Computation, 2006.

[21] Christian P. Robert et al., Monte Carlo Statistical Methods, Springer Texts in Statistics, 2005.

[22] Csaba Szepesvári et al., Empirical Bernstein stopping, ICML, 2008.

[23] Andrew W. Moore et al., Hoeffding Races: Accelerating Model Selection Search for Classification and Function Approximation, NIPS, 1993.

[24] J. Rosenthal et al., Optimal scaling for various Metropolis-Hastings algorithms, 2001.