Specification of random effects in multilevel models: a review

The analysis of highly structured data requires models with unobserved components (random effects) able to adequately account for the patterns of variances and correlations. The specification of the unobserved components is a key and challenging task. In this paper, we first review the literature about the consequences of misspecifying the distribution of the random effects and the related diagnostic tools; we then outline the main alternatives and generalizations, also considering some issues arising in Bayesian inference. The relevance of suitably structuring the unobserved components is illustrated by means of an application exploiting a model with heteroscedastic random effects.
