MCMC Methods for Functions: Modifying Old Algorithms to Make Them Faster

Many problems arising in applications result in the need to probe a probability distribution for functions. Examples include Bayesian nonparametric statistics and conditioned diffusion processes. Standard MCMC algorithms typically become arbitrarily slow under the mesh refinement dictated by nonparametric description of the unknown function. We describe an approach to modifying a whole range of MCMC methods, applicable whenever the target measure has density with respect to a Gaussian process or Gaussian random field reference measure, which ensures that their speed of convergence is robust under mesh refinement. Gaussian processes or random fields are fields whose marginal distributions, when evaluated at any finite set of N points, are R^N-valued Gaussians. The algorithmic approach that we describe is applicable not only when the desired probability measure has density with respect to a Gaussian process or Gaussian random field reference measure, but also to some useful non-Gaussian reference measures constructed through random truncation. In the applications of interest the data is often sparse and the prior specification is an essential part of the overall modelling strategy. These Gaussian-based reference measures are a very flexible modelling tool, finding wide-ranging application. Examples are shown in density estimation, data assimilation in fluid mechanics, subsurface geophysics and image registration. The key design principle is to formulate the MCMC method so that it is, in principle, applicable for functions; this may be achieved by use of proposals based on carefully chosen time-discretizations of stochastic dynamical systems which exactly preserve the Gaussian reference measure. Taking this approach leads to many new algorithms which can be implemented via minor modification of existing algorithms, yet which show enormous speed-up on a wide range of applied problems.
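
The design principle above can be illustrated with a minimal sketch of one proposal of this type, the preconditioned Crank-Nicolson (pCN) random walk. The abstract does not name a specific algorithm, so the choice of pCN here is an assumption, as are all names in the code (`pcn_sample`, `phi`, `sample_prior`, `beta`). The sketch assumes the target has density proportional to exp(-Phi(u)) with respect to a centred Gaussian reference measure N(0, C), discretized on a finite mesh. The proposal v = sqrt(1 - beta^2) u + beta xi, with xi drawn from N(0, C), leaves N(0, C) exactly invariant, so the acceptance probability min(1, exp(Phi(u) - Phi(v))) involves only Phi and not the reference measure.

```python
import numpy as np

def pcn_sample(phi, sample_prior, u0, beta, n_steps, rng):
    """Sketch of a pCN Metropolis-Hastings sampler (hypothetical helper).

    phi          : callable, negative log-density w.r.t. the Gaussian reference N(0, C)
    sample_prior : callable rng -> one draw from N(0, C) on the mesh
    beta         : step size in (0, 1]
    Returns the chain and the empirical acceptance rate.
    """
    u = np.asarray(u0, dtype=float)
    phi_u = phi(u)
    samples = np.empty((n_steps, u.size))
    accepted = 0
    for k in range(n_steps):
        xi = sample_prior(rng)
        # Proposal that preserves N(0, C): a damped copy of u plus fresh prior noise.
        v = np.sqrt(1.0 - beta**2) * u + beta * xi
        phi_v = phi(v)
        # Accept with probability min(1, exp(phi(u) - phi(v))).
        if np.log(rng.uniform()) < phi_u - phi_v:
            u, phi_u = v, phi_v
            accepted += 1
        samples[k] = u
    return samples, accepted / n_steps

# Toy problem (assumed for illustration): diagonal covariance with decaying
# eigenvalues j^{-2}, and a likelihood that observes the first three modes.
rng = np.random.default_rng(0)
d = 100
std = 1.0 / np.arange(1, d + 1)  # square roots of the eigenvalues of C

def sample_prior(rng):
    return std * rng.standard_normal(d)

def phi(u):
    return 0.5 * np.sum((u[:3] - 1.0) ** 2) / 0.1

samples, rate = pcn_sample(phi, sample_prior, sample_prior(rng),
                           beta=0.2, n_steps=5000, rng=rng)
print(f"acceptance rate: {rate:.2f}")
```

Compared with a standard random walk proposal v = u + beta xi, the only change is the factor sqrt(1 - beta^2) on the current state, which is the kind of minor modification the abstract refers to. In this sketch the acceptance rule does not depend explicitly on the mesh size `d`, which is the property that makes the method well defined for functions.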
