Accelerated Dimension-Independent Adaptive Metropolis

This work describes algorithmic and architectural improvements to black-box Bayesian inference over high-dimensional parameter spaces. The well-known adaptive Metropolis (AM) algorithm [H. Haario, E. Saksman, and J. Tamminen, Bernoulli, (2001), pp. 223--242] is extended here to scale asymptotically uniformly with respect to the underlying parameter dimension for Gaussian targets, by respecting the variance of the target. The resulting algorithm, referred to as the dimension-independent adaptive Metropolis (DIAM) algorithm, also improves on adaptive Metropolis for non-Gaussian targets. The algorithm is further accelerated, and probing high-dimensional targets (dimension $d \geq 1000$) is made feasible, via GPU-accelerated numerical libraries and periodically synchronized concurrent chains (justified a posteriori). Asymptotically in dimension, this GPU implementation exhibits a factor-of-four improvement versus a competitive CPU-based Intel MKL (math ...
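For context, the sketch below shows the baseline covariance adaptation of the classic AM algorithm that DIAM builds on. It is not the paper's DIAM variant (the target-variance rescaling, GPU acceleration, and concurrent-chain synchronization are not reproduced here, since the abstract does not specify them); `log_target`, `adapt_start`, and `eps` are illustrative names chosen for this sketch.

```python
# Minimal sketch of the classic adaptive Metropolis (AM) proposal of
# Haario, Saksman, and Tamminen (2001): a random-walk Metropolis sampler
# whose proposal covariance is adapted from the running empirical
# covariance of the chain.  NOT the paper's DIAM algorithm.
import numpy as np

def adaptive_metropolis(log_target, x0, n_steps, adapt_start=200, eps=1e-6):
    d = len(x0)
    sd = 2.4**2 / d                          # Haario et al. dimension scaling
    chain = np.empty((n_steps, d))
    x = np.asarray(x0, dtype=float)
    logp = log_target(x)
    mean = x.copy()                          # running mean of the chain
    cov = np.eye(d)                          # running covariance (identity start for regularization)
    for n in range(n_steps):
        if n >= adapt_start:
            C = sd * (cov + eps * np.eye(d)) # adapted, regularized proposal covariance
        else:
            C = (0.1**2 / d) * np.eye(d)     # fixed proposal during burn-in of the adaptation
        y = np.random.multivariate_normal(x, C)   # symmetric random-walk proposal
        logq = log_target(y)
        if np.log(np.random.rand()) < logq - logp: # Metropolis accept/reject
            x, logp = y, logq
        chain[n] = x
        # recursive update of the empirical mean and covariance (count = n + 2, including x0)
        delta = x - mean
        mean += delta / (n + 2)
        cov += (np.outer(delta, x - mean) - cov) / (n + 2)
    return chain

# Usage: sample a correlated 2-D Gaussian target (illustrative only).
if __name__ == "__main__":
    A = np.array([[1.0, 0.9], [0.9, 1.0]])
    P = np.linalg.inv(A)
    samples = adaptive_metropolis(lambda x: -0.5 * x @ P @ x, np.zeros(2), 5000)
    print(samples.mean(axis=0))
    print(np.cov(samples.T))
```

The abstract's dimension-independence claim concerns how such adaptation behaves as $d$ grows; in the plain AM scheme above, the $2.4^2/d$ scaling alone does not yield uniform-in-dimension performance, which is the gap DIAM addresses.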

[1]  G. Fort,et al.  Convergence of adaptive and interacting Markov chain Monte Carlo algorithms , 2011, 1203.3036.

[2]  Hoon Kim,et al.  Monte Carlo Statistical Methods , 2000, Technometrics.

[3]  C. Geyer Markov Chain Monte Carlo Maximum Likelihood , 1991 .

[4]  G. Roberts,et al.  Optimal scalings of Metropolis-Hastings algorithms for non-product targets in high dimensions , 2009 .

[5]  Franck Cappello,et al.  The International Exascale Software Project: a Call To Cooperative Action By the Global High-Performance Community , 2009, Int. J. High Perform. Comput. Appl..

[6]  S. Shreve,et al.  Stochastic differential equations , 1955, Mathematical Proceedings of the Cambridge Philosophical Society.

[7]  Richard L. Tweedie,et al.  Markov Chains and Stochastic Stability , 1993, Communications and Control Engineering Series.

[8]  P. Cochat,et al.  Et al , 2008, Archives de pediatrie : organe officiel de la Societe francaise de pediatrie.

[9]  Jack Dongarra,et al.  QUARK Users' Guide: QUeueing And Runtime for Kernels , 2011 .

[10]  Geoffrey E. Hinton,et al.  Bayesian Learning for Neural Networks , 1995 .

[11]  Christophe Andrieu,et al.  A tutorial on adaptive MCMC , 2008, Stat. Comput..

[12]  J. Rosenthal,et al.  Optimal scaling for various Metropolis-Hastings algorithms , 2001 .

[13]  Chao Yang,et al.  Learn From Thy Neighbor: Parallel-Chain and Regional Adaptive MCMC , 2009 .

[14]  Kody J. H. Law Proposals which speed up function-space MCMC , 2014, J. Comput. Appl. Math..

[15]  Radu V. Craiu,et al.  Multiprocess parallel antithetic coupling for backward and forward Markov Chain Monte Carlo , 2005, math/0505631.

[16]  P. Priouret,et al.  A central limit theorem for adaptive and interacting Markov chains , 2011, 1107.2574.

[17]  Jonathan C. Mattingly,et al.  SPDE limits of the random walk Metropolis algorithm in high dimensions , 2009 .

[18]  T. Faniran Numerical Solution of Stochastic Differential Equations , 2015 .

[19]  B. Matérn Spatial variation : Stochastic models and their application to some problems in forest surveys and other sampling investigations , 1960 .

[20]  C. Andrieu,et al.  On the ergodicity properties of some adaptive MCMC algorithms , 2006, math/0610317.

[21]  Edward I. George,et al.  Bayes and big data: the consensus Monte Carlo algorithm , 2016, Big Data and Information Theory.

[22]  Christian P. Robert,et al.  Monte Carlo Statistical Methods (Springer Texts in Statistics) , 2005 .

[23]  Corporate The MPI Forum,et al.  MPI: a message passing interface , 1993, Supercomputing '93.

[24]  J. D. Doll,et al.  Brownian dynamics as smart Monte Carlo simulation , 1978 .

[25]  David B. Dunson,et al.  Robust and Scalable Bayes via a Median of Subset Posterior Measures , 2014, J. Mach. Learn. Res..

[26]  W. Gilks Markov Chain Monte Carlo , 2005 .

[27]  A. Gelman,et al.  Weak convergence and optimal scaling of random walk Metropolis algorithms , 1997 .

[28]  G. Roberts,et al.  MCMC Methods for Functions: ModifyingOld Algorithms to Make Them Faster , 2012, 1202.0709.

[29]  Jack Dongarra,et al.  LINPACK Users' Guide , 1987 .

[30]  Ben Calderhead,et al.  A general construction for parallelizing Metropolis−Hastings algorithms , 2014, Proceedings of the National Academy of Sciences.

[31]  Ingvar Strid Efficient parallelisation of Metropolis-Hastings algorithms using a prefetching approach , 2010, Comput. Stat. Data Anal..

[32]  Matti Vihola,et al.  Robust adaptive Metropolis algorithm with coerced acceptance rate , 2010, Statistics and Computing.

[33]  M. Girolami,et al.  Solving large-scale PDE-constrained Bayesian inverse problems with Riemann manifold Hamiltonian Monte Carlo , 2014, 1407.1517.

[34]  P. Moral,et al.  Sequential Monte Carlo samplers , 2002, cond-mat/0212648.

[35]  Max Welling,et al.  Austerity in MCMC Land: Cutting the Metropolis-Hastings Budget , 2013, ICML 2014.

[36]  Arnaud Doucet,et al.  On the Utility of Graphics Cards to Perform Massively Parallel Simulation of Advanced Monte Carlo Methods , 2009, Journal of computational and graphical statistics : a joint publication of American Statistical Association, Institute of Mathematical Statistics, Interface Foundation of North America.

[37]  K. Hukushima,et al.  Exchange Monte Carlo Method and Application to Spin Glass Simulations , 1995, cond-mat/9512035.

[38]  David E. Keyes,et al.  KBLAS: An Optimized Library for Dense Matrix-Vector Multiplication on GPU Accelerators , 2014, ACM Trans. Math. Softw..

[39]  Alexandre H. Thi'ery,et al.  Optimal Scaling and Diffusion Limits for the Langevin Algorithm in High Dimensions , 2011, 1103.0542.

[40]  James Martin,et al.  A Stochastic Newton MCMC Method for Large-Scale Statistical Inverse Problems with Application to Seismic Inversion , 2012, SIAM J. Sci. Comput..

[41]  Y. Marzouk,et al.  Large-Scale Inverse Problems and Quantification of Uncertainty , 1994 .

[42]  S. Duane,et al.  Hybrid Monte Carlo , 1987 .

[43]  Huaiyu Zhu On Information and Sufficiency , 1997 .

[44]  Heikki Haario,et al.  DRAM: Efficient adaptive MCMC , 2006, Stat. Comput..

[45]  Christian P. Robert,et al.  Using Parallel Computation to Improve Independent Metropolis–Hastings Based Estimation , 2010, ArXiv.

[46]  L. Tierney A note on Metropolis-Hastings kernels for general state spaces , 1998 .

[47]  Claudia Schillings,et al.  Scaling Limits in Computational Bayesian Inversion , 2014 .

[48]  Cliburn Chan,et al.  Understanding GPU Programming for Statistical Computation: Studies in Massively Parallel Massive Mixtures , 2010, Journal of computational and graphical statistics : a joint publication of American Statistical Association, Institute of Mathematical Statistics, Interface Foundation of North America.

[49]  W. K. Hastings,et al.  Monte Carlo Sampling Methods Using Markov Chains and Their Applications , 1970 .

[50]  A. V. D. Vaart,et al.  Convergence rates of posterior distributions , 2000 .

[51]  E. Somersalo,et al.  Statistical inversion and Monte Carlo sampling methods in electrical impedance tomography , 2000 .

[52]  G. Roberts,et al.  MCMC methods for diffusion bridges , 2008 .

[53]  D. J. Farlie,et al.  Prediction and Regulation by Linear Least-Square Methods , 1964 .

[54]  D. Rubin,et al.  Inference from Iterative Simulation Using Multiple Sequences , 1992 .

[55]  J. Rosenthal,et al.  Optimal scaling of discrete approximations to Langevin diffusions , 1998 .

[56]  Heikki Haario,et al.  Componentwise adaptation for high dimensional MCMC , 2005, Comput. Stat..

[57]  Albert Tarantola,et al.  Inverse problem theory - and methods for model parameter estimation , 2004 .

[58]  V. Bogachev Gaussian Measures on a , 2022 .

[59]  E. Saksman,et al.  On the ergodicity of the adaptive Metropolis algorithm on unbounded domains , 2008, 0806.2933.

[60]  Andrew Gelman,et al.  General methods for monitoring convergence of iterative simulations , 1998 .

[61]  N. Metropolis,et al.  Equation of State Calculations by Fast Computing Machines , 1953, Resonance.

[62]  Tiangang Cui,et al.  Dimension-independent likelihood-informed MCMC , 2014, J. Comput. Phys..

[63]  Darren J. Wilkinson,et al.  Parallel Bayesian Computation , 2005 .

[64]  Bart G. van Bloemen Waanders,et al.  Fast Algorithms for Bayesian Uncertainty Quantification in Large-Scale Linear Inverse Problems Based on Low-Rank Partial Hessian Approximations , 2011, SIAM J. Sci. Comput..

[65]  Sebastian J. Vollmer,et al.  Dimension-Independent MCMC Sampling for Inverse Problems with Non-Gaussian Priors , 2013, SIAM/ASA J. Uncertain. Quantification.

[66]  J. M. Sanz-Serna,et al.  Optimal tuning of the hybrid Monte Carlo algorithm , 2010, 1001.4460.

[67]  J. Rosenthal,et al.  Coupling and Ergodicity of Adaptive Markov Chain Monte Carlo Algorithms , 2007, Journal of Applied Probability.

[68]  Heikki Haario,et al.  Efficient MCMC for Climate Model Parameter Estimation: Parallel Adaptive Chains and Early Rejection , 2012 .

[69]  Peter Green,et al.  Markov chain Monte Carlo in Practice , 1996 .

[70]  Ryan P. Adams,et al.  Firefly Monte Carlo: Exact MCMC with Subsets of Data , 2014, UAI.

[71]  H. Haario,et al.  An adaptive Metropolis algorithm , 2001 .

[72]  R. Tweedie,et al.  Exponential convergence of Langevin distributions and their discrete approximations , 1996 .

[73]  Tiangang Cui,et al.  Likelihood-informed dimension reduction for nonlinear inverse problems , 2014, 1403.4680.

[74]  O. L. Maître,et al.  Spectral Methods for Uncertainty Quantification: With Applications to Computational Fluid Dynamics , 2010 .

[75]  M. Girolami,et al.  Riemann manifold Langevin and Hamiltonian Monte Carlo methods , 2011, Journal of the Royal Statistical Society: Series B (Statistical Methodology).

[76]  C. Schwab,et al.  Scaling limits in computational Bayesian inversion , 2014 .