Continuously Tempered Hamiltonian Monte Carlo

Hamiltonian Monte Carlo (HMC) is a powerful Markov chain Monte Carlo (MCMC) method for performing approximate inference in complex probabilistic models of continuous variables. In common with many MCMC methods, however, the standard HMC approach performs poorly in distributions with multiple isolated modes. We present a method for augmenting the Hamiltonian system with an extra continuous temperature control variable which allows the dynamics to bridge between sampling a complex target distribution and a simpler unimodal base distribution. This augmentation both helps improve mixing in multimodal targets and allows the normalisation constant of the target distribution to be estimated. The method is simple to implement within existing HMC code, requiring only a standard leapfrog integrator. We demonstrate experimentally that the method is competitive with annealed importance sampling and simulated tempering methods at sampling from challenging multimodal distributions and estimating their normalising constants.
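
The abstract describes augmenting the Hamiltonian system with a continuous temperature control variable and simulating the joint dynamics with a standard leapfrog integrator. The following is a minimal illustrative sketch of that general idea, not the paper's exact formulation: it assumes a geometric interpolation between the target and base potential energies, a logistic map beta(u) from the control variable to an inverse temperature, and identity mass matrices. All function names here are hypothetical.

```python
import numpy as np

def beta(u):
    # Assumed smooth schedule mapping the control variable u to an
    # inverse temperature in (0, 1); a logistic map is used for illustration.
    return 1.0 / (1.0 + np.exp(-u))

def dbeta_du(u):
    b = beta(u)
    return b * (1.0 - b)

def extended_grads(x, u, target_energy, grad_target, base_energy, grad_base):
    # Gradients of the interpolated potential
    #   phi(x, u) = beta(u) * E_target(x) + (1 - beta(u)) * E_base(x),
    # with respect to the position x and the temperature control variable u.
    b = beta(u)
    dphi_dx = b * grad_target(x) + (1.0 - b) * grad_base(x)
    dphi_du = dbeta_du(u) * (target_energy(x) - base_energy(x))
    return dphi_dx, dphi_du

def leapfrog_step(x, u, p, v, step_size,
                  target_energy, grad_target, base_energy, grad_base):
    # One standard leapfrog step on the extended state (x, u) with
    # conjugate momenta (p, v), assuming identity mass matrices.
    dphi_dx, dphi_du = extended_grads(
        x, u, target_energy, grad_target, base_energy, grad_base)
    p = p - 0.5 * step_size * dphi_dx      # half step for momenta
    v = v - 0.5 * step_size * dphi_du
    x = x + step_size * p                  # full step for positions
    u = u + step_size * v
    dphi_dx, dphi_du = extended_grads(
        x, u, target_energy, grad_target, base_energy, grad_base)
    p = p - 0.5 * step_size * dphi_dx      # second half step for momenta
    v = v - 0.5 * step_size * dphi_du
    return x, u, p, v
```

Because beta(u) varies continuously along a trajectory, the dynamics can pass through regions where the (unimodal) base distribution dominates, which is what allows movement between isolated modes of the target; samples with beta(u) close to 1 are then the ones used for inference about the target.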
