Doubly Robust and Multiple-Imputation-Based Generalized Estimating Equations

Generalized estimating equations (GEE), proposed by Liang and Zeger (1986), are a popular method for analyzing correlated non-Gaussian data. When data are incomplete, GEE suffers from its frequentist nature: inferences are valid only under the strong assumption that the data are missing completely at random (MCAR). When response data are instead missing at random (MAR), two modifications of GEE can be considered, based on inverse-probability weighting or on multiple imputation. The weighted GEE (WGEE) method weights each observation by the inverse of its probability of being observed. Multiple imputation (MI) fills in each missing observation several times with values predicted by an assumed imputation model. The so-called doubly robust (DR) methods combine both ingredients: a model for the weights and a predictive model for the missing observations given the observed ones. To yield consistent estimates, WGEE requires a correctly specified dropout model, whereas imputation-based methodology requires a correctly specified imputation model. DR methods require that either the weight model or the predictive model be correctly specified, but not necessarily both. Focusing on incomplete binary repeated measures, we use simulation studies to compare the singly robust and doubly robust versions of GEE under a variety of correctly and incorrectly specified models. Data from a clinical trial in onychomycosis further illustrate the methods.
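
The robustness properties described above can be illustrated with a small simulation. The following is a minimal sketch under strong simplifying assumptions, not the authors' implementation: it uses a single binary outcome per subject rather than repeated measures, targets a marginal mean rather than GEE regression parameters, and replaces multiple imputation by a single regression-based prediction. The data-generating values and all names (`fit_logistic`, `run_sketch`, the coefficients -0.5, 1.5, 0.5, -1.0) are hypothetical choices made for illustration only.

```python
import numpy as np


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic(X, y, n_iter=25):
    """Logistic regression by plain Newton-Raphson; returns coefficients."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = expit(X @ beta)
        grad = X.T @ (y - p)
        hess = (X * (p * (1.0 - p))[:, None]).T @ X
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def run_sketch(n=50_000, seed=0):
    """Compare weighted, imputation-based and doubly robust mean estimates."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)                                    # always observed
    y = rng.binomial(1, expit(-0.5 + 1.5 * x)).astype(float)  # binary outcome
    r = rng.binomial(1, expit(0.5 - 1.0 * x)).astype(float)   # 1 = observed (MAR given x)
    obs = r == 1

    X_right = np.column_stack([np.ones(n), x])  # correctly specified design
    X_wrong = np.ones((n, 1))                   # misspecified: ignores x

    def estimators(X_weight, X_outcome):
        # Weight model: probability of being observed.
        pi = expit(X_weight @ fit_logistic(X_weight, r))
        # Predictive model: fitted on observed outcomes only.
        m = expit(X_outcome @ fit_logistic(X_outcome[obs], y[obs]))
        y_obs = np.where(obs, y, 0.0)       # unobserved outcomes are never used
        resid = np.where(obs, y - m, 0.0)
        return {
            "ipw": np.mean(y_obs / pi),     # inverse-probability weighting
            "imputation": np.mean(m),       # prediction-based estimate
            "dr": np.mean(m + resid / pi),  # augmented (doubly robust) estimate
        }

    both_right = estimators(X_right, X_right)
    return {
        "truth": y.mean(),                  # full-data benchmark
        "complete_case": y[obs].mean(),
        "ipw": both_right["ipw"],
        "imputation": both_right["imputation"],
        "dr_wrong_weights": estimators(X_wrong, X_right)["dr"],
        "dr_wrong_outcome": estimators(X_right, X_wrong)["dr"],
        "dr_both_wrong": estimators(X_wrong, X_wrong)["dr"],
    }


results = run_sketch()
for name, value in results.items():
    print(f"{name:>18s}: {value:.4f}")
```

In this toy setting the doubly robust estimate should stay close to the full-data benchmark when only one of the two working models ignores `x`, and should drift toward the biased complete-case mean when both do. This mirrors, in a much simpler problem, the contrast between singly and doubly robust methods that the simulation study examines.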

[1]  L. Hunt, et al. Missing Data in Clinical Studies, 2007.

[2]  James M. Robins, et al. Unified Methods for Censored Longitudinal Data and Causality, 2003.

[3]  J. Dale. Global cross-ratio models for bivariate, discrete, ordered responses, 1986, Biometrics.

[4]  Eric R. Ziegel, et al. Multivariate Statistical Modelling Based on Generalized Linear Models, 2002, Technometrics.

[5]  S. Zeger, et al. Longitudinal data analysis using generalized linear models, 1986.

[6]  G. Molenberghs, et al. Marginal modelling of multivariate categorical data, 1999, Statistics in medicine.

[7]  M. Kenward, et al. The analysis of longitudinal ordinal data with nonrandom drop-out, 1997.

[8]  A. Rotnitzky. Inverse probability weighted methods, 2008.

[9]  Geert Molenberghs, et al. A simulation study comparing weighted estimating equations with multiple imputation based estimating equations for longitudinal binary data, 2008, Comput. Stat. Data Anal.

[10]  J. Robins, et al. Analysis of semiparametric regression models for repeated outcomes in the presence of missing data, 1995.

[11]  G. Molenberghs, et al. Models for Discrete Longitudinal Data, 2005.

[12]  M. Kenward, et al. A comparison of multiple imputation and doubly robust estimation for analyses with missing data, 2006.

[13]  Donald Hedeker, et al. Longitudinal Data Analysis, 2006.

[14]  J. Robins, et al. Adjusting for Nonignorable Drop-Out Using Semiparametric Nonresponse Models, 1999.

[15]  Geert Molenberghs, et al. Last Observation Carried Forward: A Crystal Ball?, 2009, Journal of biopharmaceutical statistics.

[16]  R. Plackett. A Class of Bivariate Distributions, 1965.

[17]  G. Molenberghs, et al. Marginal Modeling of Correlated Ordinal Data Using a Multivariate Plackett Distribution, 1994.

[18]  Geert Molenberghs, et al. PSEUDO-LIKELIHOOD ESTIMATION FOR INCOMPLETE DATA, 2011.

[19]  S. Zeger, et al. Multivariate Regression Analyses for Categorical Data, 1992.

[20]  A. Winsor. Sampling techniques, 2000, Nursing times.

[21]  J. Robins, et al. Doubly Robust Estimation in Missing Data and Causal Inference Models, 2005, Biometrics.

[22]  Roderick J. A. Little, et al. Adjusting for Nonignorable Drop-Out Using Semiparametric Nonresponse Models: Comment, 1999.

[23]  Ana Ivelisse Avilés, et al. Linear Mixed Models for Longitudinal Data, 2001, Technometrics.

[24]  J. Schafer. Multiple Imputation in Multivariate Problems When the Imputation and Analysis Models Differ, 2003.

[25]  P. Diggle. Analysis of Longitudinal Data, 1995.

[26]  D. Rubin, et al. MULTIPLE IMPUTATIONS IN SAMPLE SURVEYS-A PHENOMENOLOGICAL BAYESIAN APPROACH TO NONRESPONSE, 2002.

[27]  E. Lesaffre, et al. A 12–week treatment for dermatophyte toe onychomycosis terbinafine 250mg/day vs. itraconazole 200mg/day—a double‐blind comparative trial, 1996, The British journal of dermatology.

[28]  D. Roberts, et al. Prevalence of dermatophyte onychomycosis in the United Kingdom: Results of an omnibus survey, 1992, The British journal of dermatology.

[29]  A. Agresti, et al. Simultaneously Modeling Joint and Marginal Distributions of Multivariate Categorical Responses, 1994.

[30]  J. Schafer. Multiple imputation: a primer, 1999, Statistical methods in medical research.

[31]  Xiao-Li Meng, et al. Multiple-Imputation Inferences with Uncongenial Sources of Input, 1994.

[32]  David J. Spiegelhalter, et al. Analysis of longitudinal binary data from multiphase sampling, 1998.

[33]  Nicole A. Lazar, et al. Statistical Analysis With Missing Data, 2003, Technometrics.

[34]  Roderick J. A. Little, et al. Statistical Analysis with Missing Data, 2002.

[35]  Rhian Daniel, et al. On aspects of robustness and sensitivity in missing data methods, 2009.

[36]  P. Diggle, et al. Modelling multivariate binary data with alternating logistic regressions, 1993.

[37]  D. Rubin. INFERENCE AND MISSING DATA, 1975.