Imputation Maximization Stochastic Approximation with Application to Generalized Linear Mixed Models

Generalized linear mixed models are useful for studying hierarchical data with possibly non-Gaussian responses. However, their likelihood functions are intractable, which makes estimation challenging. We develop a new method for this problem, called imputation maximization stochastic approximation (IMSA). In each iteration, IMSA first imputes the latent variables (random effects), then maximizes the complete-data likelihood, and finally moves the estimate toward the new maximizer while retaining a proportion of its previous value. The limiting point of IMSA satisfies a self-consistency property and can be less biased in finite samples than the maximum likelihood estimator computed by score-equation-based stochastic approximation (ScoreSA). Numerically, IMSA can also have advantages over ScoreSA: it achieves more stable convergence and respects parameter ranges, such as nonnegative variance components, under various transformations. Our simulation studies corroborate this, with IMSA consistently outperforming ScoreSA.
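The three steps of an IMSA iteration (impute, maximize, move partway toward the maximizer) can be sketched on a toy problem. The example below is only an illustration under stated assumptions, not the authors' implementation: it uses a Gaussian random-intercept model with known unit error variance as a stand-in for a GLMM, because the conditional distribution of the random effects is then available in closed form. The function names, the step size t^(-0.7), and the starting values are all hypothetical choices.

```python
import numpy as np


def simulate_data(m, n, mu, sigma2, seed):
    """Toy data: y[i, j] = mu + b[i] + e[i, j], b ~ N(0, sigma2), e ~ N(0, 1)."""
    rng = np.random.default_rng(seed)
    b = np.sqrt(sigma2) * rng.standard_normal(m)
    return mu + b[:, None] + rng.standard_normal((m, n))


def imsa_random_intercept(y, n_iter=2000, seed=0):
    """IMSA-style iteration for (mu, sigma2) in the toy model above.

    Assumptions (not from the source): unit error variance is known,
    the step size is t**-0.7, and mu starts at the grand mean.
    """
    rng = np.random.default_rng(seed)
    m, n = y.shape
    mu, sigma2 = y.mean(), 1.0
    for t in range(1, n_iter + 1):
        # 1. Imputation: draw b_i from its conditional law given y and
        #    the current estimate (closed form in this Gaussian toy model).
        post_var = 1.0 / (n + 1.0 / sigma2)
        post_mean = post_var * (y - mu).sum(axis=1)
        b = post_mean + np.sqrt(post_var) * rng.standard_normal(m)

        # 2. Maximization: complete-data maximum likelihood estimates.
        mu_hat = (y - b[:, None]).mean()
        sigma2_hat = np.mean(b ** 2)

        # 3. Stochastic approximation: move toward the new maximizer
        #    while keeping a proportion (1 - gamma) of the previous value.
        gamma = t ** -0.7
        mu = (1.0 - gamma) * mu + gamma * mu_hat
        sigma2 = (1.0 - gamma) * sigma2 + gamma * sigma2_hat
    return mu, sigma2


if __name__ == "__main__":
    y = simulate_data(m=200, n=5, mu=2.0, sigma2=1.5, seed=1)
    print(imsa_random_intercept(y))
```

Because each update is a convex combination of the previous value and a complete-data maximizer, the variance estimate in this sketch stays positive at every iteration without any reparameterization, which is one way to read the abstract's remark about respecting parameter ranges.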
