Covariance estimators for generalized estimating equations (GEE) in longitudinal analysis with small samples

Generalized estimating equations (GEE) are a widely used method for fitting marginal models to longitudinal data in biomedical studies. The variance-covariance matrix of the regression coefficients is usually estimated by a robust "sandwich" variance estimator, which does not perform satisfactorily when the sample size is small because it is biased downward. Several modified variance estimators have been proposed to correct this bias or to improve efficiency. In this paper, we provide a comprehensive review of recent developments in modified variance estimators and compare their small-sample performance theoretically and numerically through simulation and real data examples. In particular, Wald tests and t-tests based on the different variance estimators are used for hypothesis testing, and, based on the numerical results, we provide guidelines on the sample size each estimator needs in order to preserve the type I error rate in general cases. Moreover, we develop a user-friendly R package, "geesmv", that incorporates all of these variance estimators for public use.
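To make the sandwich construction concrete, the following is a minimal Python sketch, not the interface of "geesmv". It assumes the simplest setting: an identity link and an independence working correlation, so the coefficient estimate coincides with ordinary least squares. The function name `gee_sandwich` and its `correct` flag are hypothetical. With `correct=True`, each cluster's residual vector is rescaled by the inverse of (I - H_ii), a leverage-based bias correction in the spirit of Mancl and DeRouen (2001), offered here as one illustration of the modified estimators rather than a full implementation of any of them.

```python
import numpy as np


def gee_sandwich(X, y, cluster, correct=False):
    """Sandwich covariance for a linear marginal model (illustrative sketch).

    Assumes identity link and independence working correlation, so the
    point estimate equals ordinary least squares. Returns (beta, cov).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    cluster = np.asarray(cluster)

    bread = np.linalg.inv(X.T @ X)      # model-based part ("bread")
    beta = bread @ X.T @ y
    resid = y - X @ beta

    p = X.shape[1]
    meat = np.zeros((p, p))             # empirical part ("meat")
    for c in np.unique(cluster):
        idx = cluster == c
        Xi, ri = X[idx], resid[idx]
        if correct:
            # Cluster leverage H_ii = X_i (X'X)^{-1} X_i'; rescale the
            # residuals by (I - H_ii)^{-1} to offset their shrinkage.
            Hi = Xi @ bread @ Xi.T
            ri = np.linalg.solve(np.eye(len(ri)) - Hi, ri)
        score = Xi.T @ ri
        meat += np.outer(score, score)

    return beta, bread @ meat @ bread


# Usage on simulated data: 10 subjects with 4 repeated measures each,
# sharing a subject-level random intercept (all values are made up).
rng = np.random.default_rng(0)
n_subjects, n_visits = 10, 4
subject = np.repeat(np.arange(n_subjects), n_visits)
time = np.tile(np.arange(n_visits, dtype=float), n_subjects)
X = np.column_stack([np.ones_like(time), time])
y = (1.0 + 0.5 * time
     + rng.normal(size=n_subjects)[subject]
     + rng.normal(size=time.size))

beta, cov_robust = gee_sandwich(X, y, subject)
_, cov_corrected = gee_sandwich(X, y, subject, correct=True)
print("robust SE:   ", np.sqrt(np.diag(cov_robust)))
print("corrected SE:", np.sqrt(np.diag(cov_corrected)))
```

When every observation forms its own cluster, the uncorrected version reduces to the classical heteroskedasticity-consistent (HC0) estimator and the corrected version to HC3, which gives a convenient sanity check for the sketch.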
