Ascent‐based Monte Carlo expectation–maximization

Summary.  The expectation–maximization (EM) algorithm is a popular tool for maximizing likelihood functions in the presence of missing data. Unfortunately, EM often requires the evaluation of analytically intractable and high dimensional integrals. The Monte Carlo EM (MCEM) algorithm is the natural extension of EM that employs Monte Carlo methods to estimate the relevant integrals. Typically, a very large Monte Carlo sample size is required to estimate these integrals within an acceptable tolerance when the algorithm is near convergence. Even if this sample size were known at the onset of implementation of MCEM, its use throughout all iterations is wasteful, especially when accurate starting values are not available. We propose a data‐driven strategy for controlling Monte Carlo resources in MCEM. The algorithm proposed improves on similar existing methods by recovering EM's ascent (i.e. likelihood increasing) property with high probability, being more robust to the effect of user‐defined inputs and handling classical Monte Carlo and Markov chain Monte Carlo methods within a common framework. Because of the first of these properties we refer to the algorithm as ‘ascent‐based MCEM’. We apply ascent‐based MCEM to a variety of examples, including one where it is used to accelerate the convergence of deterministic EM dramatically.
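The idea of spending Monte Carlo effort only when the ascent property is in doubt can be illustrated with a minimal sketch. It is not the authors' procedure: the toy model (y_i | u_i ~ N(u_i, 1) with u_i ~ N(theta, 1), for which the maximum likelihood estimate is the sample mean of y), the function name `ascent_mcem`, and the parameters `m0`, `z`, `growth` and `max_m` are all assumptions made for illustration. Each iteration accepts the candidate update only when a normal-approximation lower bound on the estimated change in the Q-function is positive; otherwise it appends further draws and tries again. For simplicity the standard error treats the candidate as fixed.

```python
import math
import random


def ascent_mcem(y, theta0, m0=20, z=1.645, growth=2.0, iters=25,
                max_m=2000, seed=0):
    """Toy ascent-checked Monte Carlo EM (illustrative sketch only).

    Assumed model: y_i | u_i ~ N(u_i, 1), u_i ~ N(theta, 1), u_i unobserved.
    Returns (final theta, final Monte Carlo sample size, theta history).
    """
    rng = random.Random(seed)
    n = len(y)
    theta, m = theta0, m0
    history = [theta]
    sd = math.sqrt(0.5)  # conditional sd of u_i given y_i and theta

    for _ in range(iters):
        draws = []  # each element is one simulated vector u of length n
        while True:
            # Monte Carlo E-step: u_i | y_i, theta ~ N((y_i + theta)/2, 1/2).
            # Earlier draws are kept, so a larger m only appends new ones.
            while len(draws) < m:
                draws.append([rng.gauss((yi + theta) / 2.0, sd) for yi in y])

            # M-step: the Monte Carlo Q-function is maximized by the
            # average of all simulated u values.
            candidate = sum(sum(u) for u in draws) / (m * n)

            # Per-draw contribution to Q(candidate) - Q(theta).
            diffs = [
                0.5 * sum((ui - theta) ** 2 - (ui - candidate) ** 2 for ui in u)
                for u in draws
            ]
            mean_diff = sum(diffs) / m
            var = sum((d - mean_diff) ** 2 for d in diffs) / (m - 1)
            std_err = math.sqrt(var / m)

            # Accept when the lower confidence bound on the change in Q is
            # positive (or the sample-size cap is reached); otherwise grow m.
            if mean_diff - z * std_err > 0 or m >= max_m:
                break
            m = min(int(math.ceil(m * growth)), max_m)

        theta = candidate
        history.append(theta)

    return theta, m, history
```

Far from the maximum the check passes with a small sample, so early iterations are cheap; near convergence the estimated change in Q is small relative to its Monte Carlo error, and the sample size grows until the cap `max_m` is reached.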
