Adaptive Bayesian Sampling with Monte Carlo EM

We present a novel technique for learning the mass matrices in samplers obtained from discretized dynamics that preserve some energy function. Existing adaptive samplers use Riemannian preconditioning techniques, where the mass matrices are functions of the parameters being sampled. This significantly complicates the energy reformulations and the resultant dynamics, often yielding implicit systems of equations and requiring inversion of high-dimensional matrices in the leapfrog steps. Our approach provides a simpler alternative: it uses existing dynamics in the sampling step of a Monte Carlo EM framework and learns the mass matrices in the M step with a novel online technique. We also propose a way to adaptively set the number of samples gathered in the E step, using sampling error estimates from the leapfrog dynamics. Along with a novel stochastic sampler based on Nosé-Poincaré dynamics, we use this framework with standard Hamiltonian Monte Carlo (HMC) as well as newer stochastic algorithms such as SGHMC and SGNHT, and show strong performance on synthetic and real high-dimensional sampling scenarios; we achieve sampling accuracies comparable to Riemannian samplers while being significantly faster.
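The overall loop described above (sample with existing dynamics in the E step, then update the mass matrix in the M step) can be illustrated with a minimal sketch. This is not the paper's algorithm: the diagonal mass matrix, the M step that sets the mass to the mean squared momentum (the maximum likelihood estimate under a zero-mean Gaussian momentum model), the 1/(k+1) blending weight, the fixed E-step sample count, and all function and parameter names (`leapfrog`, `hmc_mcem`, `n_rounds`, `n_samples`) are assumptions made for illustration only.

```python
import numpy as np


def leapfrog(theta, p, grad_U, inv_mass, step, n_steps):
    """Leapfrog integrator for H = U(theta) + 0.5 * sum(p**2 * inv_mass)."""
    p = p - 0.5 * step * grad_U(theta)          # initial half step in momentum
    for i in range(n_steps):
        theta = theta + step * inv_mass * p     # full step in position
        if i < n_steps - 1:
            p = p - step * grad_U(theta)        # full step in momentum
    p = p - 0.5 * step * grad_U(theta)          # final half step in momentum
    return theta, p


def hmc_mcem(U, grad_U, theta0, n_rounds=30, n_samples=20,
             step=0.1, n_steps=10, seed=0):
    """Hypothetical Monte Carlo EM loop around plain HMC (illustrative only)."""
    rng = np.random.default_rng(seed)
    theta = np.array(theta0, dtype=float)
    mass = np.ones_like(theta)                  # diagonal mass matrix (assumption)
    samples = []
    for k in range(1, n_rounds + 1):
        sq_momenta = []
        # E step: draw a fixed number of HMC samples under the current mass.
        for _ in range(n_samples):
            p0 = rng.normal(size=theta.shape) * np.sqrt(mass)
            h0 = U(theta) + 0.5 * np.sum(p0 ** 2 / mass)
            theta_new, p_new = leapfrog(theta, p0, grad_U, 1.0 / mass,
                                        step, n_steps)
            h1 = U(theta_new) + 0.5 * np.sum(p_new ** 2 / mass)
            # Metropolis correction for the discretization error.
            if np.log(rng.uniform()) < h0 - h1:
                theta, p_end = theta_new, p_new
            else:
                p_end = p0
            samples.append(theta.copy())
            sq_momenta.append(p_end ** 2)
        # M step (assumed form): Gaussian MLE of the diagonal mass from the
        # sampled momenta, blended online with a decaying weight.
        mass_hat = np.mean(sq_momenta, axis=0)
        rho = 1.0 / (k + 1)
        mass = (1.0 - rho) * mass + rho * mass_hat
    return np.array(samples), mass


if __name__ == "__main__":
    # Toy target: zero-mean Gaussian with per-dimension variances 1 and 4.
    var = np.array([1.0, 4.0])
    U = lambda th: 0.5 * np.sum(th ** 2 / var)
    grad_U = lambda th: th / var
    samples, mass = hmc_mcem(U, grad_U, theta0=[0.0, 0.0])
    print(samples.shape, mass)
```

The decaying blend weight is in the spirit of stochastic approximation schedules; the adaptive choice of the E-step sample size from leapfrog error estimates mentioned above is not modeled here, and `n_samples` is simply held fixed.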
