Statistical analysis of correlated data using generalized estimating equations: an orientation.

The method of generalized estimating equations (GEE) is often used to analyze longitudinal and other correlated response data, particularly if responses are binary. However, few descriptions of the method are accessible to epidemiologists. In this paper, the authors use small worked examples and one real data set, involving both binary and quantitative response data, to help end-users appreciate the essence of the method. The examples are simple enough that readers can follow the behind-the-scenes calculations and see the essential role of weighted observations, and they allow nonstatisticians to imagine the calculations involved when the GEE method is applied to more complex multivariate data.
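To make the role of weighted observations concrete, the following is a minimal sketch rather than a calculation taken from the paper. It assumes the simplest possible setting: a single common mean, an identity link, and an exchangeable working correlation `rho` that is treated as known. Under those assumptions the GEE solution reduces to a weighted mean in which every observation in a cluster of size n receives weight 1 / (1 + (n - 1) * rho). The function name `gee_mean` and the toy data are hypothetical.

```python
def gee_mean(clusters, rho):
    """Estimate a common mean from clustered data.

    Assumes an identity link and an exchangeable working correlation
    `rho` that is supplied by the caller (not estimated here).
    `clusters` is a list of lists, one inner list per cluster.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for responses in clusters:
        n = len(responses)
        # Per-observation weight implied by the exchangeable structure:
        # the more correlated the cluster, the less each extra member counts.
        w = 1.0 / (1.0 + (n - 1) * rho)
        weighted_sum += w * sum(responses)
        total_weight += w * n
    return weighted_sum / total_weight


# Hypothetical binary data: one cluster of four and one singleton.
toy = [[1, 1, 1, 1], [0]]

independent = gee_mean(toy, rho=0.0)  # every observation counts equally
perfect = gee_mean(toy, rho=1.0)      # every cluster counts equally
```

With `rho = 0` the estimate is the ordinary mean of all five observations; with `rho = 1` each cluster contributes as if it were a single observation, so the estimate is the mean of the two cluster means. Intermediate values of `rho` fall between these two extremes.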
