Monte Carlo Standard Errors for Markov Chain Monte Carlo

Markov chain Monte Carlo (MCMC) is a method of producing a correlated sample to estimate characteristics of a target distribution. A fundamental question is how long the simulation should be run. One way to address this question is to run the simulation until the width of a confidence interval for the quantity of interest falls below a user-specified value. This fixed-width method requires an estimate of the Monte Carlo standard error (MCSE). This dissertation begins by discussing why MCSEs are important, how they can be easily calculated in MCMC, and how they can be used to decide when to stop the simulation. The use of MCSEs is then compared to a popular alternative in the context of multiple examples. The dissertation continues by discussing the relevant Markov chain theory, with particular attention paid to the conditions and definitions needed to establish a Markov chain central limit theorem. Estimating MCSEs requires estimating the associated asymptotic variance. I introduce several techniques for estimating MCSEs: batch means, overlapping batch means, regeneration, subsampling, and spectral variance estimation. Asymptotic properties useful in MCMC settings are established for these variance estimators. Specifically, I establish conditions under which the estimator of the asymptotic variance in a Markov chain central limit theorem is strongly consistent. Strong consistency ensures that the resulting confidence intervals are asymptotically valid. In addition, I establish conditions that ensure mean-square consistency for the batch means and overlapping batch means estimators. Mean-square consistency is useful to MCMC practitioners in choosing an optimal batch size. Several approaches have been introduced for the special case of estimating ergodic averages and their corresponding standard errors. Surprisingly, very little attention has been given to characteristics of the target distribution that cannot be represented as ergodic averages. To this end, I propose the use of subsampling methods as a means of estimating the qth quantile of the posterior distribution. Finally, the finite-sample properties of subsampling are examined.
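The fixed-width idea described above can be illustrated with a minimal sketch. It is not the dissertation's own implementation. It assumes a toy AR(1) chain in place of a real MCMC sampler, a batch size of floor(sqrt(n)), a normal critical value of 1.96, and a check every 1000 iterations. The function names (`batch_means_mcse`, `run_fixed_width`) and all default parameters are hypothetical choices made for the example.

```python
import math
import random


def batch_means_mcse(chain):
    """MCSE of the sample mean using non-overlapping batch means."""
    n = len(chain)
    b = int(math.floor(math.sqrt(n)))  # assumed batch-size rule: floor(sqrt(n))
    a = n // b                         # number of full batches
    if a < 2:
        raise ValueError("need at least two batches")
    used = chain[n - a * b:]           # drop leftover draws from the start
    mu = sum(used) / (a * b)
    batch_means = [sum(used[k * b:(k + 1) * b]) / b for k in range(a)]
    # Batch means estimate of the asymptotic variance in the Markov chain CLT.
    sigma2_hat = b * sum((y - mu) ** 2 for y in batch_means) / (a - 1)
    return math.sqrt(sigma2_hat / (a * b))


def run_fixed_width(eps, rho=0.5, z=1.96, min_n=1000, check_every=1000,
                    max_n=1_000_000, seed=1):
    """Simulate a toy AR(1) chain until the interval half-width is below eps."""
    rng = random.Random(seed)
    chain = []
    x = 0.0
    while len(chain) < max_n:
        x = rho * x + rng.gauss(0.0, 1.0)  # toy stand-in for one MCMC update
        chain.append(x)
        n = len(chain)
        if n >= min_n and n % check_every == 0:
            # Stop once the half-width z * MCSE is below the target eps.
            if z * batch_means_mcse(chain) < eps:
                break
    estimate = sum(chain) / len(chain)
    return estimate, batch_means_mcse(chain), len(chain)


if __name__ == "__main__":
    est, mcse, n = run_fixed_width(eps=0.05)
    print(f"n = {n}, estimate = {est:.4f}, MCSE = {mcse:.4f}")
```

Checking the stopping rule only at intervals, and only after a minimum run length, is a design choice in this sketch: it keeps the cost of recomputing the MCSE low and guards against stopping on an unreliable early variance estimate.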
