Spherical Hamiltonian Monte Carlo for Constrained Target Distributions

Statistical models with constrained probability distributions are abundant in machine learning. Examples include regression models with norm constraints (e.g., Lasso), probit models, many copula models, and Latent Dirichlet Allocation (LDA) models. Bayesian inference involving probability distributions confined to constrained domains can be challenging for commonly used sampling algorithms. For such problems, we propose a novel Markov Chain Monte Carlo (MCMC) method that provides a general and computationally efficient framework for handling boundary conditions. Our method first maps the $D$-dimensional constrained domain of parameters to the unit ball $\mathbf{B}_0^D(1)$, then augments it to a $D$-dimensional sphere $\mathbf{S}^D$ such that the original boundary corresponds to the equator of $\mathbf{S}^D$. In this way, our method handles the constraints implicitly: it moves freely on the sphere and generates proposals that remain within the boundaries when mapped back to the original space. To improve the computational efficiency of our algorithm, we split the dynamics into several parts such that the resulting split dynamics has a partial analytical solution as a geodesic flow on the sphere. We apply our method to several examples, including truncated Gaussian, Bayesian Lasso, Bayesian bridge regression, and a copula model for identifying synchrony among multiple neurons. Our results show that the proposed method provides a natural and efficient framework for handling several types of constraints on target distributions.
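The ball-to-sphere augmentation and the geodesic part of the split dynamics can be illustrated with a minimal sketch. It is not the full algorithm: it assumes the parameters have already been mapped into the unit ball, and it omits the potential-energy (gradient) updates and the Metropolis acceptance step. The function names `ball_to_sphere`, `sphere_to_ball`, `project_to_tangent`, and `geodesic_flow` are hypothetical and chosen only for illustration.

```python
import numpy as np

def ball_to_sphere(theta):
    """Augment a point in the unit ball in R^D to a point on the sphere S^D in R^(D+1)."""
    theta = np.asarray(theta, dtype=float)
    r2 = float(theta @ theta)
    if r2 > 1.0:
        raise ValueError("theta must lie in the unit ball")
    # The extra coordinate is zero exactly on the boundary, i.e. on the equator.
    return np.append(theta, np.sqrt(1.0 - r2))

def sphere_to_ball(x):
    """Map back to the ball by dropping the auxiliary last coordinate."""
    return np.asarray(x, dtype=float)[:-1]

def project_to_tangent(x, v):
    """Remove the component of v along x so that v is tangent to the sphere at x."""
    return v - (x @ v) * x

def geodesic_flow(x, v, t):
    """Move along the great circle through x with tangent velocity v for time t."""
    speed = np.linalg.norm(v)
    if speed == 0.0:
        return x.copy(), v.copy()
    c, s = np.cos(speed * t), np.sin(speed * t)
    x_new = x * c + (v / speed) * s
    v_new = -x * speed * s + v * c
    return x_new, v_new

# Illustrative values only (assumed, not taken from the paper).
rng = np.random.default_rng(0)
theta = np.array([0.3, -0.4])                      # a point inside the unit disc
x = ball_to_sphere(theta)                          # a point on S^2
v = project_to_tangent(x, rng.standard_normal(x.size))
x_new, v_new = geodesic_flow(x, v, t=2.5)
theta_new = sphere_to_ball(x_new)                  # lies in the unit disc again
```

Because the flow never leaves the sphere, dropping the last coordinate always returns a point with norm at most one, whatever the integration time. This is the sense in which the constraint is handled implicitly rather than by rejecting or reflecting proposals.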
