Joint Selection in Mixed Models using Regularized PQL

ABSTRACT The application of generalized linear mixed models presents major challenges for both estimation, due to the intractable marginal likelihood, and model selection, as we usually want to select jointly over both fixed and random effects. We propose to overcome these challenges by combining penalized quasi-likelihood (PQL) estimation with sparsity-inducing penalties on the fixed and random coefficients. The resulting approach, referred to as regularized PQL, is a computationally efficient method for performing joint selection in mixed models. A key aspect of regularized PQL is the use of a group-based penalty for the random effects: sparsity is induced such that all the coefficients for a random effect are shrunk to zero simultaneously, which in turn removes that random effect from the model. Despite being a quasi-likelihood approach, we show that regularized PQL is selection consistent, that is, it asymptotically selects the true set of fixed and random effects, in the setting where the cluster size grows with the number of clusters. Furthermore, we propose an information criterion for choosing the single tuning parameter and show that it facilitates selection consistency. Simulations demonstrate that regularized PQL outperforms several currently employed methods for joint selection even if the cluster size is small compared to the number of clusters, while also offering dramatic reductions in computation time. Supplementary materials for this article are available online.
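
The group-based penalty described above can be illustrated with a small sketch. It is not the article's estimation algorithm: it only applies group soft-thresholding (the proximal operator of an unsquared Euclidean norm penalty) to a made-up matrix of predicted random effects, to show how every coefficient belonging to one random effect reaches zero at the same time. The function name, the data, and the value of the tuning parameter are all assumptions chosen for illustration.

```python
import numpy as np

def group_soft_threshold(b, lam):
    """Proximal operator of lam * ||b||_2: shrinks the whole vector jointly."""
    norm = np.linalg.norm(b)
    if norm <= lam:
        # The entire group is set to zero at once.
        return np.zeros_like(b)
    # Otherwise every coefficient is scaled by the same factor.
    return (1.0 - lam / norm) * b

# Hypothetical predicted random effects: rows are clusters, columns are
# random effects (e.g. a random intercept and a random slope).
b_hat = np.array([
    [ 1.2,  0.05],
    [-0.8, -0.02],
    [ 0.9,  0.03],
    [-1.1, -0.04],
])
lam = 0.5  # illustrative tuning parameter, not a recommended value

# Penalize each random effect as one group (one column across all clusters).
shrunk = np.column_stack(
    [group_soft_threshold(b_hat[:, l], lam) for l in range(b_hat.shape[1])]
)

# A random effect stays in the model only if its group is not entirely zero.
selected = [l for l in range(shrunk.shape[1]) if np.any(shrunk[:, l] != 0)]
print(selected)
```

Because the threshold acts on the norm of the whole column, the second random effect, whose coefficients are all small, is dropped for every cluster together, while the first is retained with shrunken coefficients.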
