Stratification of Markov Chain Monte Carlo

We describe a simple and effective technique, the Eigenvector Method for Umbrella Sampling (EMUS), for accurately estimating small probabilities and expectations with respect to a given target probability density. In EMUS, we apply the principle of stratified survey sampling to Markov chain Monte Carlo (MCMC) simulation: We divide the support of the target distribution into regions called strata, we use MCMC to sample (in parallel) from probability distributions supported in each of the strata, and we weight the data from each stratum to assemble estimates of general averages with respect to the target distribution. We demonstrate by theoretical results and computational examples that EMUS can be dramatically more efficient than direct Markov chain Monte Carlo when the target distribution is multimodal or when the goal is to compute tail probabilities.
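The stratify, sample, and reweight steps described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the target is a standard normal density, the strata are overlapping unit-spaced intervals covering [-6, 8] (mass outside that range is ignored), each stratum is sampled with a simple random-walk Metropolis chain, and the stratum weights are taken as the left eigenvector, with eigenvalue 1, of an overlap matrix F estimated from the samples. The names `log_target` and `emus_tail_probability`, the window layout, and the chain length are all hypothetical choices made for the example.

```python
import math
import numpy as np


def log_target(x):
    # Unnormalized log density of a standard normal (illustrative target).
    return -0.5 * x**2


def emus_tail_probability(threshold=4.0, n_steps=20000, seed=0):
    """Sketch of an EMUS-style estimate of P(X > threshold)."""
    rng = np.random.default_rng(seed)

    # Overlapping strata [lo_i, hi_i] = [-6+i, -4+i], i = 0..12 (assumed layout).
    lo = np.arange(-6.0, 7.0)
    hi = lo + 2.0
    n_strata = lo.size

    # Step 1: one Metropolis chain per stratum, advanced together.
    # A proposal that leaves its stratum is rejected, so chain i samples
    # the target restricted to stratum i.
    x = 0.5 * (lo + hi)
    samples = np.empty((n_steps, n_strata))
    steps = rng.uniform(-1.0, 1.0, size=(n_steps, n_strata))
    log_u = np.log(1.0 - rng.random((n_steps, n_strata)))
    for t in range(n_steps):
        y = x + steps[t]
        inside = (y >= lo) & (y <= hi)
        accept = inside & (log_u[t] < log_target(y) - log_target(x))
        x = np.where(accept, y, x)
        samples[t] = x

    # Step 2: overlap matrix. psi[t, i, j] = 1 if sample t of chain i
    # lies in stratum j; denom counts the strata containing that sample.
    s = samples[:, :, None]
    psi = ((s >= lo) & (s <= hi)).astype(float)
    denom = psi.sum(axis=2)
    F = (psi / denom[:, :, None]).mean(axis=0)  # rows sum to 1

    # Step 3: stratum weights z solve z F = z (left eigenvector of F).
    eigvals, eigvecs = np.linalg.eig(F.T)
    k = np.argmin(np.abs(eigvals - 1.0))
    z = np.real(eigvecs[:, k])
    z = z / z.sum()

    # Step 4: weighted combination of per-stratum averages.
    g = (samples > threshold).astype(float)
    numerator = z @ (g / denom).mean(axis=0)
    normalizer = z @ (1.0 / denom).mean(axis=0)
    return numerator / normalizer, z, F


if __name__ == "__main__":
    estimate, z, F = emus_tail_probability()
    exact = 0.5 * math.erfc(4.0 / math.sqrt(2.0))
    print(f"stratified estimate: {estimate:.3e}, exact: {exact:.3e}")
```

The strata overlap on purpose: with disjoint indicator strata the overlap matrix would be the identity and would carry no information about the relative weights. A direct chain of the same total length would rarely, if ever, visit the region beyond 4, whereas here several chains are confined to it.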
