Estimating negative variance components from Gaussian and non-Gaussian data: A mixed models approach

The occurrence of negative variance components is a reasonably well understood phenomenon in linear models for hierarchical data, such as variance-component models in designed experiments or linear mixed models for longitudinal data. In many cases, such negative variance components can be interpreted as negative within-unit correlations. It is shown that negative variance components, with corresponding negative associations, can also occur in hierarchical models for non-Gaussian outcomes, such as repeated binary data or counts. While this feature poses no problem for marginal models, in which the mean and correlation functions are modeled directly and separately, the issue is more complicated in, for example, generalized linear mixed models. This is due in part to the non-linear nature of the link function, the non-constant residual variance stemming from the mean-variance link, and the resulting lack of closed-form expressions for the marginal correlations. It is established that such negative variance components in generalized linear mixed models can occur in practice and that they can be estimated using standard statistical software. Marginal-correlation functions are derived. Important implications for interpretation and model choice are discussed. Simulations and the analysis of data from a developmental toxicity experiment underscore these results.
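The link between a negative variance component and negative within-unit correlation in the linear (Gaussian) case can be illustrated with a minimal sketch. It assumes a balanced compound-symmetry model with marginal covariance sigma2 * I + d * J, which remains a valid covariance matrix for negative d as long as sigma2 + n * d > 0. The estimator used here is the classical one-way ANOVA (method-of-moments) estimator, chosen for illustration only; it is not the mixed-model estimation procedure of the paper, and all function names and parameter values are hypothetical.

```python
import numpy as np

def simulate_compound_symmetry(m, n, d, sigma2, rng):
    """Draw m clusters of size n with marginal covariance sigma2*I + d*J.

    d plays the role of the random-intercept variance and may be negative,
    provided sigma2 + n*d > 0 so the matrix stays positive definite.
    """
    cov = sigma2 * np.eye(n) + d * np.ones((n, n))
    return rng.multivariate_normal(np.zeros(n), cov, size=m)

def anova_variance_components(y):
    """Unconstrained one-way ANOVA estimates (d_hat, sigma2_hat), balanced data."""
    y = np.asarray(y, dtype=float)
    m, n = y.shape
    cluster_means = y.mean(axis=1)
    grand_mean = y.mean()
    # Between-cluster and within-cluster mean squares.
    msb = n * np.sum((cluster_means - grand_mean) ** 2) / (m - 1)
    msw = np.sum((y - cluster_means[:, None]) ** 2) / (m * (n - 1))
    # E[MSB] = sigma2 + n*d and E[MSW] = sigma2, so d_hat is negative
    # whenever units vary less between clusters than within them.
    d_hat = (msb - msw) / n
    return d_hat, msw

# Illustrative values (assumptions): true d = -0.2, sigma2 = 1, cluster size 3.
rng = np.random.default_rng(0)
y = simulate_compound_symmetry(m=5000, n=3, d=-0.2, sigma2=1.0, rng=rng)
d_hat, sigma2_hat = anova_variance_components(y)
rho_hat = d_hat / (d_hat + sigma2_hat)  # implied within-unit correlation
print(f"d_hat={d_hat:.3f}, sigma2_hat={sigma2_hat:.3f}, rho_hat={rho_hat:.3f}")
```

In this setting the implied correlation d / (d + sigma2) is bounded below by -1 / (n - 1), so the admissible range for a negative component shrinks as cluster size grows. For non-Gaussian outcomes no such closed-form relation is available in general, which is the complication the abstract describes.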
