A tutorial on adaptive MCMC

We review adaptive Markov chain Monte Carlo (MCMC) algorithms as a means of optimising the performance of MCMC samplers. Using simple toy examples we review their theoretical underpinnings, and in particular show why adaptive MCMC algorithms might fail when some fundamental properties are not satisfied. This leads to guidelines concerning the design of correct algorithms. We then review performance criteria and the framework of stochastic approximation, which allows one both to optimise commonly used criteria systematically and to analyse the properties of adaptive MCMC algorithms. We then propose a series of novel adaptive algorithms which prove to be robust and reliable in practice. These algorithms are applied to artificial and high-dimensional scenarios, as well as to the classic mine disaster dataset inference problem.
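
As a rough illustration of how stochastic approximation can tune a sampler, the sketch below adapts the proposal scale of a one-dimensional random-walk Metropolis sampler with a Robbins-Monro recursion. It is not one of the algorithms proposed in the paper. The function name `adaptive_rwm`, the target acceptance rate of 0.44 (a commonly quoted heuristic for one-dimensional targets) and the step-size sequence `n ** -0.6` are all assumptions chosen for the example. The step sizes shrink to zero, so the adaptation diminishes over time, which is one of the properties usually required for such schemes to remain valid.

```python
import math
import random


def adaptive_rwm(log_target, x0, n_iter, target_accept=0.44, seed=0):
    """Random-walk Metropolis in 1D with a Robbins-Monro adapted proposal scale.

    Hypothetical helper for illustration only; names and defaults are assumptions.
    Returns (samples, final_scale, overall_acceptance_rate).
    """
    rng = random.Random(seed)
    x = x0
    log_px = log_target(x)
    log_scale = 0.0  # adapt on the log scale so the proposal scale stays positive
    samples = []
    n_accept = 0
    for n in range(1, n_iter + 1):
        # Gaussian random-walk proposal with the current scale
        proposal = x + math.exp(log_scale) * rng.gauss(0.0, 1.0)
        log_pp = log_target(proposal)
        # Metropolis acceptance probability min(1, pi(proposal) / pi(x))
        diff = log_pp - log_px
        alpha = 1.0 if diff >= 0 else math.exp(diff)
        if rng.random() < alpha:
            x, log_px = proposal, log_pp
            n_accept += 1
        # Robbins-Monro update: raise the scale when accepting too often,
        # lower it when accepting too rarely; gamma decreases to zero
        gamma = n ** -0.6
        log_scale += gamma * (alpha - target_accept)
        samples.append(x)
    return samples, math.exp(log_scale), n_accept / n_iter


if __name__ == "__main__":
    # Standard normal target, specified up to an additive constant
    draws, scale, rate = adaptive_rwm(lambda x: -0.5 * x * x, x0=0.0, n_iter=20000)
    print(f"final scale {scale:.2f}, acceptance rate {rate:.2f}")
```

Updating the logarithm of the scale rather than the scale itself is a design choice made here to avoid an explicit positivity constraint; other parametrisations are possible.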
