Efficiency of regression estimates for clustered data.

Statistical methods for clustered data, such as generalized estimating equations (GEE) and generalized least squares (GLS), require choosing a correlation or covariance structure to describe the dependence between observations within a cluster. Regression estimates remain valid even when this structure is misspecified, but an inappropriate specification can reduce efficiency. We derive general expressions for the asymptotic relative efficiency of GEE and GLS estimators under nested correlation structures. Efficiency is shown to depend on the covariate distribution, the cluster sizes, the correlation of the response variable, and the regression parameters. The results demonstrate that efficiency is quite sensitive to the between- and within-cluster variation of the covariates, and they characterize the models for which the upper and lower efficiency bounds are attained. Efficiency losses for simple working correlation matrices, such as independence, can be large even for small to moderate correlations and cluster sizes.
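As a rough illustration of the quantity discussed above, the sketch below computes the relative efficiency of a working correlation structure against the true one in the simplest setting: a linear model with identity link and constant variance, where the true within-cluster correlation is assumed to be exchangeable. This is not the paper's general derivation. The function names (`exchangeable`, `sandwich_cov`, `relative_efficiency`) and the simulated design are hypothetical and chosen only for this example.

```python
import numpy as np


def exchangeable(n, rho):
    """Exchangeable correlation matrix of size n with common correlation rho."""
    return (1.0 - rho) * np.eye(n) + rho * np.ones((n, n))


def sandwich_cov(X_clusters, working, true):
    """Covariance (up to a common scale) of the weighted least squares
    estimator that uses the `working` correlation when the data actually
    follow the `true` correlation.

    X_clusters: list of (n_i x p) design matrices, one per cluster.
    working, true: callables mapping a cluster size n to an (n x n) matrix.
    """
    p = X_clusters[0].shape[1]
    bread = np.zeros((p, p))
    meat = np.zeros((p, p))
    for X in X_clusters:
        n = X.shape[0]
        W = np.linalg.inv(working(n))  # weight matrix implied by the working structure
        V = true(n)                    # actual within-cluster correlation
        bread += X.T @ W @ X
        meat += X.T @ W @ V @ W @ X
    bread_inv = np.linalg.inv(bread)
    return bread_inv @ meat @ bread_inv


def relative_efficiency(X_clusters, working, true):
    """Per-coefficient ratio: variance under the true structure divided by
    variance under the working structure. Values lie in (0, 1]."""
    optimal = sandwich_cov(X_clusters, true, true)
    actual = sandwich_cov(X_clusters, working, true)
    return np.diag(optimal) / np.diag(actual)


# Hypothetical design: 50 clusters of size 4, an intercept plus one covariate
# that varies both between and within clusters.
rng = np.random.default_rng(0)
clusters = [np.column_stack([np.ones(4), rng.normal(size=4)]) for _ in range(50)]

eff = relative_efficiency(
    clusters,
    working=lambda n: np.eye(n),            # independence working correlation
    true=lambda n: exchangeable(n, 0.3),    # assumed true correlation
)
print(eff)  # relative efficiency for (intercept, slope)
```

In this simplified setting with equal cluster sizes, a covariate that is constant within each cluster, or one that varies only within clusters, leaves the independence estimator fully efficient. A loss appears when the covariate mixes between- and within-cluster variation, which agrees with the sensitivity described in the abstract.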
