Variational Inference for Generalized Linear Mixed Models Using Partially Noncentered Parametrizations

The effects of different parametrizations on the convergence of Bayesian computational algorithms for hierarchical models are well explored. Techniques such as centering, noncentering and partial noncentering can be used to accelerate convergence in MCMC and EM algorithms but are still not well studied for variational Bayes (VB) methods. As a fast deterministic approach to posterior approximation, VB is attracting increasing interest due to its suitability for large high-dimensional data. Use of different parametrizations for VB has not only computational but also statistical implications, as different parametrizations are associated with different factorized posterior approximations. We examine the use of partially noncentered parametrizations in VB for generalized linear mixed models (GLMMs). Our paper makes four contributions. First, we show how to implement an algorithm called nonconjugate variational message passing for GLMMs. Second, we show that the partially noncentered parametrization can adapt to the quantity of information in the data and determine a parametrization close to optimal. Third, we show that partial noncentering can accelerate convergence and produce more accurate posterior approximations than centering or noncentering. Finally, we demonstrate how the variational lower bound, produced as part of the computation, can be useful for model selection.
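
As a rough illustration of the three parametrizations named above, consider a Gaussian random-intercept model rather than a full GLMM. The symbols $\mu$, $\alpha_i$, $u_i$, $w_i$, $\sigma_y^2$, $\sigma_\alpha^2$ and $n_i$ are assumptions introduced for this sketch and are not taken from the paper's own notation.

```latex
% Centered: the random effect carries the mean.
\alpha_i \sim N(\mu, \sigma_\alpha^2), \qquad
y_{ij} \mid \alpha_i \sim N(\alpha_i, \sigma_y^2), \quad j = 1, \dots, n_i .

% Noncentered: the mean is subtracted out of the random effect.
u_i = \alpha_i - \mu \sim N(0, \sigma_\alpha^2), \qquad
y_{ij} \mid u_i, \mu \sim N(\mu + u_i, \sigma_y^2).

% Partially noncentered: only a fraction w_i of the mean is subtracted.
\tilde{\alpha}_i = \alpha_i - w_i \mu \sim N\big((1 - w_i)\mu, \sigma_\alpha^2\big), \qquad
y_{ij} \mid \tilde{\alpha}_i, \mu \sim N(\tilde{\alpha}_i + w_i \mu, \sigma_y^2),
\qquad 0 \le w_i \le 1 .

% w_i = 0 recovers centering and w_i = 1 recovers noncentering.
% With known variances and a flat prior on \mu, \tilde{\alpha}_i and \mu
% are a posteriori independent when
w_i = \frac{\sigma_y^2 / n_i}{\sigma_y^2 / n_i + \sigma_\alpha^2} .
```

In this simplified setting the weight moves toward 0 (centering) as the data on group $i$ become more informative, and toward 1 (noncentering) as they become less so, which is one way to read the claim that partial noncentering adapts to the quantity of information in the data.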
