Diffusion limit for the random walk Metropolis algorithm out of stationarity

The Random Walk Metropolis (RWM) algorithm is a Metropolis-Hastings MCMC algorithm designed to sample from a given target distribution $\pi$ with a Lebesgue density on $\mathbb{R}^N$. RWM constructs a Markov chain by randomly proposing a new position (the "proposal move"), which is then accepted or rejected according to a rule that makes the chain reversible with respect to $\pi$. When the dimension $N$ is large, a key question is how the proposal variance should scale with $N$: if the proposal variance is too large, the algorithm rejects the proposed moves too often; if it is too small, the algorithm explores the state space too slowly. Determining the optimal scaling of the proposal variance also gives a measure of the cost of the algorithm. One approach to this issue, which we adopt here, is to derive diffusion limits for the algorithm. This approach was proposed in the seminal papers [RGG97, RR98]; in particular, in [RGG97] the authors derive a diffusion limit for the RWM algorithm under the following two assumptions: i) the algorithm is started in stationarity; ii) the target measure $\pi$ is in product form. The present paper considers the situation of practical interest in which both assumptions i) and ii) are removed. That is, a) we study the case (which occurs in practice) in which the algorithm is started out of stationarity, and b) we consider target measures which are in non-product form. The target measures that we consider arise in Bayesian nonparametric statistics and in the study of conditioned diffusions. We prove that, out of stationarity, the optimal scaling for the proposal variance is $O(N^{-1})$, as it is in stationarity. Notice that the optimal scalings in and out of stationarity need not be the same in general, and indeed they differ, e.g., in the case of the MALA algorithm [KOS16].
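The propose/accept-reject mechanism described above can be sketched as follows. This is a minimal illustration, not the setting analysed in the paper: it assumes a standard Gaussian product target on $\mathbb{R}^N$ (the simpler product-form case of [RGG97]), and the names `rwm_chain`, `log_standard_gaussian` and the tuning constant `ell` are hypothetical. The proposal variance is taken to be `ell**2 / N` per coordinate, i.e. of order $N^{-1}$, and the chain is deliberately started away from the bulk of the target to mimic an out-of-stationarity start.

```python
import math
import random


def log_standard_gaussian(x):
    """Log-density (up to a constant) of a standard Gaussian product target.

    Assumption for illustration only; the paper treats non-product targets.
    """
    return -0.5 * sum(xi * xi for xi in x)


def rwm_chain(log_density, x0, ell, n_steps, seed=0):
    """Run Random Walk Metropolis with per-coordinate proposal variance ell^2 / N.

    Returns the final state and the fraction of accepted proposals.
    """
    rng = random.Random(seed)
    n = len(x0)
    step = ell / math.sqrt(n)  # proposal standard deviation, variance = ell^2 / N
    x = list(x0)
    log_p = log_density(x)
    accepted = 0
    for _ in range(n_steps):
        # Proposal move: symmetric Gaussian random-walk increment.
        y = [xi + step * rng.gauss(0.0, 1.0) for xi in x]
        log_q = log_density(y)
        # Accept with probability min(1, pi(y) / pi(x)); otherwise stay put.
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < log_q - log_p:
            x, log_p = y, log_q
            accepted += 1
    return x, accepted / n_steps


if __name__ == "__main__":
    N = 50
    start = [5.0] * N  # far from the typical set of the target
    final, rate = rwm_chain(log_standard_gaussian, start, ell=2.38, n_steps=2000)
    print("acceptance rate:", rate)
    print("mean square of final state:", sum(v * v for v in final) / N)
```

Varying `ell` (or replacing `ell / math.sqrt(n)` by a different power of `n`) gives a quick empirical feel for the trade-off between too many rejections and too slow exploration.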

[1]  Christian P. Robert, et al. Introducing Monte Carlo Methods with R, 2009.

[2]  Christian P. Robert, et al. Introducing Monte Carlo Methods with R (Use R), 2009.

[3]  N. Pillai, et al. A Function Space HMC Algorithm With Second Order Langevin Diffusion Limit, 2013, 1308.0543.

[4]  Jonathan C. Mattingly, et al. SPDE limits of the random walk Metropolis algorithm in high dimensions, 2009.

[5]  Andrew M. Stuart, et al. Noisy gradient flow from a random walk in Hilbert space, 2011, 1108.1494.

[6]  A. Stuart, et al. Analysis of SPDEs arising in path sampling. Part II: The nonlinear case, 2006, math/0601092.

[7]  Alexandre H. Thiéry, et al. Optimal Scaling and Diffusion Limits for the Langevin Algorithm in High Dimensions, 2011, 1103.0542.

[8]  Gareth Roberts, et al. Optimal scalings for local Metropolis-Hastings chains on nonproduct targets in high dimensions, 2009, 0908.0865.

[9]  M. Röckner, et al. A Concise Course on Stochastic Partial Differential Equations, 2007.

[10]  Inge S. Helland, et al. Central Limit Theorems for Martingales with Discrete or Continuous Time, 1982.

[11]  B. Jourdain, et al. Optimal scaling for the transient phase of the random walk Metropolis algorithm: The mean-field limit, 2012, 1210.7639.

[12]  Andrew M. Stuart, et al. Approximation of Bayesian Inverse Problems for PDEs, 2009, SIAM J. Numer. Anal.

[13]  J. Rosenthal, et al. Scaling limits for the transient phase of local Metropolis-Hastings algorithms, 2005.

[14]  N. Pillai, et al. On the random walk Metropolis algorithm for Gaussian random field priors and the gradient flow, 2011.

[15]  L. Tierney. A note on Metropolis-Hastings kernels for general state spaces, 1998.

[16]  Juan Kuntz, et al. Non-stationary phase of the MALA algorithm, 2016, Stochastics and Partial Differential Equations: Analysis and Computations.

[17]  Jonathan C. Mattingly, et al. Diffusion limits of the random walk Metropolis algorithm in high dimensions, 2010, 1003.4306.

[18]  J. Voss, et al. Analysis of SPDEs arising in path sampling. Part I: The Gaussian case, 2005.

[19]  J. Rosenthal, et al. Optimal scaling of discrete approximations to Langevin diffusions, 1998.

[20]  D. Jordan, et al. Nonlinear Ordinary Differential Equations: An Introduction for Scientists and Engineers, 1979.

[21]  B. Jourdain, et al. Optimal scaling for the transient phase of Metropolis Hastings algorithms: The longtime behavior, 2012, 1212.5517.

[22]  E. Berger. Asymptotic behaviour of a class of stochastic approximation procedures, 1986.

[23]  A. Gelman, et al. Weak convergence and optimal scaling of random walk Metropolis algorithms, 1997.

[24]  Andrew M. Stuart, et al. Inverse problems: A Bayesian perspective, 2010, Acta Numerica.