Riemannian Manifold Hamiltonian Monte Carlo

The paper proposes a Riemannian Manifold Hamiltonian Monte Carlo (RMHMC) sampler to resolve the shortcomings of existing Monte Carlo algorithms when sampling from target densities that may be high dimensional and exhibit strong correlations. The method provides a fully automated adaptation mechanism that circumvents the costly pilot runs required to tune proposal densities for Metropolis-Hastings, Hybrid Monte Carlo, and Metropolis Adjusted Langevin algorithms. This allows highly efficient sampling even in very high dimensions, where different scalings may be required for the transient and stationary phases of the Markov chain. The proposed method exploits the Riemannian structure of the parameter space of statistical models and thus automatically adapts, via the metric tensor, to the local structure of the manifold at each step. A semi-explicit second-order symplectic integrator for non-separable Hamiltonians is derived for simulating paths across this manifold, providing highly efficient convergence and exploration of the target density. The performance of the Riemannian Manifold Hamiltonian Monte Carlo method is assessed through posterior inference on logistic regression models, log-Gaussian Cox point processes, stochastic volatility models, and Bayesian estimation of parameter posteriors of dynamical systems described by nonlinear differential equations. Substantial improvements in the time-normalised effective sample size are reported when compared with alternative sampling approaches. Matlab code at \url{http://www.dcs.gla.ac.uk/inference/rmhmc} allows replication of all results.
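
For concreteness, the Hamiltonian underlying the sampler augments the negative log target density with a position-dependent kinetic energy defined by the metric tensor $G(\theta)$ (for Bayesian models, the Fisher information plus the negative Hessian of the log-prior), giving the non-separable form

$$H(\theta, p) = -\mathcal{L}(\theta) + \frac{1}{2}\log\left\{(2\pi)^{D}\,|G(\theta)|\right\} + \frac{1}{2}\, p^{\mathsf{T}} G(\theta)^{-1} p,$$

where $\mathcal{L}(\theta)$ is the log target density, $D$ the dimension, and the momentum is resampled as $p \sim \mathcal{N}(0, G(\theta))$ at each iteration. Because $H$ is non-separable, the standard leapfrog scheme no longer yields a reversible, volume-preserving map; a generalized leapfrog of the kind the abstract describes restores this with two implicit half-updates. The sketch below illustrates one such step with the implicit equations resolved by simple fixed-point iteration; the function names, the solve-based evaluation of $\nabla_p H = G(\theta)^{-1}p$, and the iteration count are illustrative assumptions, not the authors' Matlab implementation.

```python
import numpy as np

def generalized_leapfrog_step(theta, p, eps, grad_theta_H, grad_p_H, n_fp=6):
    """One generalized-leapfrog step for a non-separable Hamiltonian H(theta, p).

    grad_theta_H(theta, p) and grad_p_H(theta, p) return the partial
    derivatives of H; the two implicit updates are resolved by
    fixed-point iteration (n_fp iterations each).
    """
    # Implicit momentum half-step: p_half = p - (eps/2) * dH/dtheta(theta, p_half)
    p_half = p.copy()
    for _ in range(n_fp):
        p_half = p - 0.5 * eps * grad_theta_H(theta, p_half)

    # Implicit position step:
    # theta' = theta + (eps/2) * [dH/dp(theta, p_half) + dH/dp(theta', p_half)]
    theta_new = theta.copy()
    for _ in range(n_fp):
        theta_new = theta + 0.5 * eps * (grad_p_H(theta, p_half)
                                         + grad_p_H(theta_new, p_half))

    # Explicit momentum half-step completes the symmetric, second-order update.
    p_new = p_half - 0.5 * eps * grad_theta_H(theta_new, p_half)
    return theta_new, p_new

# For RMHMC the momentum gradient is dH/dp = G(theta)^{-1} p, e.g. (hypothetical G):
# grad_p_H = lambda theta, p: np.linalg.solve(G(theta), p)
```

The final momentum update is explicit because $\nabla_\theta H$ is evaluated only at quantities already computed, which is why the integrator is described as semi-explicit; composed with a Metropolis accept/reject step, the resulting time-reversible, volume-preserving map leaves the target density invariant.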
