The impact of a misspecified random‐effects distribution on the estimation and the performance of inferential procedures in generalized linear mixed models

Estimation in generalized linear mixed models (GLMMs) is often based on maximum likelihood theory, which assumes that the underlying probability model is correctly specified. However, the validity of this assumption is sometimes difficult to verify. In this paper we study, through simulations, the impact of misspecifying the random-effects distribution on estimation and hypothesis testing in GLMMs. It is shown that the maximum likelihood estimators are inconsistent in the presence of misspecification. The bias induced in the mean-structure parameters is generally small, as long as the variability of the underlying random-effects distribution is also small. However, the estimates of this variability are always severely biased. Because the variance components are the only tool available for studying the variability of the true distribution, it is difficult to assess whether problems occur in the estimation of the mean structure. The type I error rate and the power of the commonly used inferential procedures are also severely affected. The situation is aggravated if more than one random effect is included in the model. Further, we propose dealing with possible misspecification by way of a sensitivity analysis that considers several random-effects distributions. All the results are illustrated using data from a clinical trial in schizophrenia.
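
The kind of simulation described above can be sketched as follows. This is a minimal illustration, not the authors' simulation design: the random-intercept logistic model, the centred exponential as the "true" skewed distribution, the sample sizes, the parameter values, and the function names (`simulate`, `fit_normal_glmm`, `run_once`) are all assumptions made for the example. The fit always assumes a normal random intercept and integrates it out with Gauss-Hermite quadrature.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit, logsumexp


def simulate(n_clusters, n_per, beta0, beta1, sigma, dist, rng):
    """Simulate clustered binary data from a random-intercept logistic model.

    dist="normal" draws N(0, sigma^2) intercepts; any other value draws a
    centred exponential with the same mean (0) and variance (sigma^2).
    """
    if dist == "normal":
        b = rng.normal(0.0, sigma, n_clusters)
    else:
        b = sigma * (rng.exponential(1.0, n_clusters) - 1.0)
    x = rng.integers(0, 2, n_clusters).astype(float)  # cluster-level covariate
    y = rng.binomial(n_per, expit(beta0 + beta1 * x + b))  # successes per cluster
    return x, y


def fit_normal_glmm(x, y, n_per, n_nodes=30):
    """Maximum likelihood fit that ASSUMES a normal random intercept."""
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    log_w = np.log(weights)

    def negloglik(theta):
        beta0, beta1, log_sigma = theta
        b = np.sqrt(2.0) * np.exp(log_sigma) * nodes  # quadrature abscissae
        p = expit(beta0 + beta1 * x[:, None] + b[None, :])
        p = np.clip(p, 1e-12, 1.0 - 1e-12)
        # Conditional binomial log-likelihood (constant term dropped).
        cond = y[:, None] * np.log(p) + (n_per - y[:, None]) * np.log1p(-p)
        # Marginal log-likelihood per cluster, integrating over the intercept.
        marg = logsumexp(cond + log_w[None, :], axis=1) - 0.5 * np.log(np.pi)
        return -marg.sum()

    res = minimize(negloglik, x0=np.zeros(3), method="L-BFGS-B",
                   bounds=[(None, None), (None, None), (-5.0, 3.0)])
    return res.x[0], res.x[1], float(np.exp(res.x[2]))


def run_once(dist, seed, n_clusters=400, n_per=10,
             beta0=-0.5, beta1=1.0, sigma=1.0):
    """Simulate one data set and return (beta0_hat, beta1_hat, sigma_hat)."""
    rng = np.random.default_rng(seed)
    x, y = simulate(n_clusters, n_per, beta0, beta1, sigma, dist, rng)
    return fit_normal_glmm(x, y, n_per)
```

Calling `run_once("normal", seed)` and `run_once("skewed", seed)` over many seeds and averaging the estimates gives a rough picture of how the fixed-effect and variance-component estimates behave when the assumed and true random-effects distributions differ. A sensitivity analysis in the spirit of the abstract would repeat the fit under several assumed distributions, which this sketch does not implement.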
