Generalized Estimating Equations

Background: Generalized estimating equations (GEE) extend generalized linear models (GLM) by allowing adjustment for correlations between observations. A major strength of GEE is that they require correct specification of the mean structure only, not of the full multivariate distribution. Objectives: Several concerns have been raised about the validity of GEE when applied to dichotomous dependent variables. In this contribution, we summarize the theoretical findings concerning the efficiency and validity of GEE. Methods: We introduce GEE formally, summarize general findings on the choice of the working correlation matrix, and show that a dilemma exists in the optimal choice of the working correlation matrix for dichotomous dependent variables. Results: Biological and statistical arguments for choosing a specific working correlation matrix are given. Three approaches are described for overcoming the range restriction of the correlation coefficient. Conclusions: The three approaches described in this article for overcoming the range restrictions for dichotomous dependent variables in GEE models are simple and practical to use in applications.
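
The range restriction mentioned in the abstract can be illustrated with a small calculation. The following is a minimal sketch, not taken from the article: it assumes two dichotomous responses with marginal success probabilities `p1` and `p2` and computes the standard Fréchet bounds on their correlation. The function name `binary_corr_bounds` is hypothetical and chosen only for this example.

```python
import math
from typing import Tuple


def binary_corr_bounds(p1: float, p2: float) -> Tuple[float, float]:
    """Return (lower, upper) bounds on corr(Y1, Y2) for Bernoulli Y1, Y2.

    The bounds follow from the constraint
    max(0, p1 + p2 - 1) <= P(Y1 = 1, Y2 = 1) <= min(p1, p2).
    """
    if not (0.0 < p1 < 1.0 and 0.0 < p2 < 1.0):
        raise ValueError("marginal probabilities must lie strictly between 0 and 1")
    q1, q2 = 1.0 - p1, 1.0 - p2
    # Lower bound: the joint success probability is as small as the margins allow.
    lower = max(-math.sqrt(p1 * p2 / (q1 * q2)), -math.sqrt(q1 * q2 / (p1 * p2)))
    # Upper bound: the joint success probability is as large as the margins allow.
    upper = min(math.sqrt(p1 * q2 / (p2 * q1)), math.sqrt(p2 * q1 / (p1 * q2)))
    return lower, upper


if __name__ == "__main__":
    # Equal margins allow the full range; unequal margins shrink it.
    print(binary_corr_bounds(0.5, 0.5))
    print(binary_corr_bounds(0.1, 0.5))
```

With equal margins of 0.5 the correlation can take any value in [-1, 1], whereas with margins of 0.1 and 0.5 it is confined to roughly [-1/3, 1/3]. Because the admissible range depends on the marginal means, a working correlation value that is valid for one pair of observations may be infeasible for another.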
