Doubly Robust Estimation in Missing Data and Causal Inference Models

Summary: The goal of this article is to construct doubly robust (DR) estimators in ignorable missing data and causal inference models. In a missing data model, an estimator is DR if it remains consistent when either (but not necessarily both) a model for the missingness mechanism or a model for the distribution of the complete data is correctly specified. Because with observational data one can never be sure that either a missingness model or a complete data model is correct, perhaps the best that can be hoped for is to find a DR estimator. DR estimators, in contrast to standard likelihood-based or (nonaugmented) inverse probability-weighted estimators, give the analyst two chances, instead of only one, to make a valid inference. In a causal inference model, an estimator is DR if it remains consistent when either a model for the treatment assignment mechanism or a model for the distribution of the counterfactual data is correctly specified. Because with observational data one can never be sure that a model for the treatment assignment mechanism or a model for the counterfactual data is correct, inference based on DR estimators should improve upon previous approaches. Indeed, we present the results of simulation studies which demonstrate that the finite sample performance of DR estimators is as impressive as theory would predict. The proposed method is applied to a cardiovascular clinical trial.
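The "two chances" property in the missing data setting can be illustrated with a small simulation. The sketch below is not the article's own construction, which the summary does not spell out. It uses the standard augmented inverse probability-weighted (AIPW) estimator of a mean as a stand-in DR estimator. The data-generating process, the sample size, the helper names (`fit_logistic`, `fit_ols`, `aipw_mean`), and the choice of an intercept-only model as the "wrong" model are all assumptions made for illustration.

```python
import numpy as np

def expit(t):
    return 1.0 / (1.0 + np.exp(-t))

def fit_logistic(Z, r, iters=25):
    """Logistic regression of r on the columns of Z by Newton-Raphson."""
    beta = np.zeros(Z.shape[1])
    for _ in range(iters):
        p = expit(Z @ beta)
        grad = Z.T @ (r - p)
        hess = (Z * (p * (1.0 - p))[:, None]).T @ Z
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def fit_ols(Z, y):
    """Least-squares regression of y on the columns of Z."""
    return np.linalg.lstsq(Z, y, rcond=None)[0]

def aipw_mean(y, r, pi_hat, m_hat):
    """AIPW estimate of E[Y]: mean of R*Y/pi - (R - pi)/pi * m(X)."""
    y_obs = np.where(r == 1, y, 0.0)  # y is never used where it is missing
    return float(np.mean(r * y_obs / pi_hat - (r - pi_hat) / pi_hat * m_hat))

# Hypothetical data: Y is missing at random given X, and the true E[Y] is 1.
rng = np.random.default_rng(0)
n = 20000
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)
r = rng.binomial(1, expit(0.5 + x))  # r = 1 means Y is observed

full = np.column_stack([np.ones(n), x])  # correctly specified design
const = np.ones((n, 1))                  # misspecified: ignores X
obs = r == 1

# Missingness (propensity) models, fitted on everyone.
pi_ok = expit(full @ fit_logistic(full, r))
pi_bad = expit(const @ fit_logistic(const, r))

# Outcome (complete data) models, fitted on complete cases only.
m_ok = full @ fit_ols(full[obs], y[obs])
m_bad = const @ fit_ols(const[obs], y[obs])

estimates = {
    "both correct": aipw_mean(y, r, pi_ok, m_ok),
    "outcome model wrong": aipw_mean(y, r, pi_ok, m_bad),
    "missingness model wrong": aipw_mean(y, r, pi_bad, m_ok),
    "both wrong": aipw_mean(y, r, pi_bad, m_bad),
}
for label, value in estimates.items():
    print(f"{label}: {value:.3f}")
```

Under these assumptions the first three estimates should land near the true mean of 1, while the fourth reduces to the complete-case mean and is biased upward, because subjects with larger X are both more likely to be observed and have larger Y. The same construction carries over to the causal inference setting by treating the counterfactual outcome under a given treatment as the partially observed variable and the treatment assignment probability as the missingness probability.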
