Working correlation structure misspecification, estimation and covariate design: Implications for generalised estimating equations performance

The method of generalised estimating equations for regression modelling of clustered outcomes allows for specification of a working matrix that is intended to approximate the true correlation matrix of the observations. We investigate the asymptotic relative efficiency of the generalised estimating equation estimator of the mean parameters when the correlation parameters are estimated by various methods. The asymptotic relative efficiency depends on three features of the analysis, namely (i) the discrepancy between the working correlation structure and the unobservable true correlation structure, (ii) the method by which the correlation parameters are estimated and (iii) the 'design', by which we refer to both the structures of the predictor matrices within clusters and the distribution of cluster sizes. Analytical and numerical studies of realistic data-analysis scenarios show that the choice of working covariance model has a substantial impact on regression estimator efficiency. Protection against avoidable loss of efficiency associated with covariance misspecification is obtained when a 'Gaussian estimation' pseudolikelihood procedure is used with an AR(1) structure.
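To make the notion of asymptotic relative efficiency concrete, the following is a minimal sketch rather than the procedure studied in the paper. It assumes an identity link with unit variances, and it holds the working correlation parameter fixed at a chosen value instead of estimating it, so feature (ii) above is not represented. The function names (`ar1`, `exchangeable`, `sandwich_cov`, `are`) and the toy design are hypothetical choices made for illustration only.

```python
import numpy as np

def ar1(n, rho):
    """AR(1) correlation matrix with entries rho**|j - k|."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def exchangeable(n, rho):
    """Exchangeable correlation matrix: 1 on the diagonal, rho elsewhere."""
    return (1 - rho) * np.eye(n) + rho * np.ones((n, n))

def sandwich_cov(X_list, R_work_list, R_true_list):
    """Sandwich covariance of the regression estimator (identity link, unit variance)."""
    p = X_list[0].shape[1]
    bread = np.zeros((p, p))
    meat = np.zeros((p, p))
    for X, Rw, Rt in zip(X_list, R_work_list, R_true_list):
        W = np.linalg.inv(Rw)            # working weight matrix
        bread += X.T @ W @ X
        meat += X.T @ W @ Rt @ W @ X     # true correlation enters only here
    bread_inv = np.linalg.inv(bread)
    return bread_inv @ meat @ bread_inv

def are(X_list, R_work_list, R_true_list):
    """Per-coefficient efficiency relative to using the true correlation as working."""
    v_work = sandwich_cov(X_list, R_work_list, R_true_list)
    v_opt = sandwich_cov(X_list, R_true_list, R_true_list)
    return np.diag(v_opt) / np.diag(v_work)

# Toy design (an assumption): clusters of size 3 and 5, intercept plus a time covariate.
sizes = [3, 5, 3, 5]
X_list = [np.column_stack([np.ones(n), np.arange(n, dtype=float)]) for n in sizes]
R_true = [ar1(n, 0.7) for n in sizes]    # assumed true structure: AR(1), rho = 0.7

are_indep = are(X_list, [np.eye(n) for n in sizes], R_true)
are_exch = are(X_list, [exchangeable(n, 0.5) for n in sizes], R_true)
are_true = are(X_list, R_true, R_true)

print("independence working:", are_indep)
print("exchangeable working:", are_exch)
print("true structure as working:", are_true)
```

In this sketch the ratio equals one when the working matrix matches the true correlation and cannot exceed one otherwise. Changing `sizes` or the columns of each matrix in `X_list` gives a rough sense of how the design alters the efficiency loss.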
