Binary variable multiple‐model multiple imputation to address missing data mechanism uncertainty: application to a smoking cessation trial

The true missing data mechanism is never known in practice. We present a method for generating multiple imputations for binary variables that formally incorporates uncertainty about the missing data mechanism. Imputations are generated from a distribution of imputation models rather than from a single model, with the distribution reflecting subjective notions of missing data mechanism uncertainty. Parameter estimates and standard errors are obtained using the combining rules for nested multiple imputation. Using simulation, we investigate the impact of missing data mechanism uncertainty on post-imputation inferences and show that incorporating this uncertainty can increase the coverage of interval estimates. We apply our method to a longitudinal smoking cessation trial in which nonignorably missing data were a concern. Our method provides a simple approach for formalizing subjective notions regarding nonresponse and can be implemented using existing imputation software.
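The two steps described above (drawing imputations from a distribution of models, then pooling with nested multiple imputation rules) can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the log-normal distribution on an odds multiplier `k`, and the use of fixed ignorable-model probabilities `p_mar` are all assumptions made for illustration; a proper procedure would also draw the imputation model parameters from their posterior. The pooling function follows the standard nested multiple imputation rules for M models with N imputations each.

```python
import math
import random
from statistics import mean


def impute_binary(y, p_mar, odds_multiplier, rng):
    """Fill None entries of y with Bernoulli draws (hypothetical helper).

    p_mar[i] is the success probability under an ignorable model and must lie
    strictly between 0 and 1. The odds are multiplied by odds_multiplier, so a
    value of 1 corresponds to an ignorable mechanism.
    """
    completed = []
    for y_i, p_i in zip(y, p_mar):
        if y_i is None:
            odds = odds_multiplier * p_i / (1.0 - p_i)
            y_i = 1 if rng.random() < odds / (1.0 + odds) else 0
        completed.append(y_i)
    return completed


def nested_imputation_estimates(y, p_mar, n_models, n_imputations,
                                log_k_mean, log_k_sd, rng):
    """Draw one multiplier per model, then several imputations per model.

    Assumption: log(k) ~ Normal(log_k_mean, log_k_sd) encodes the subjective
    uncertainty about the missing data mechanism. The analysis here is simply
    the proportion of successes and its estimated variance.
    """
    q, u = [], []
    for _ in range(n_models):
        k = math.exp(rng.gauss(log_k_mean, log_k_sd))
        q_row, u_row = [], []
        for _ in range(n_imputations):
            completed = impute_binary(y, p_mar, k, rng)
            p_hat = mean(completed)
            q_row.append(p_hat)
            u_row.append(p_hat * (1.0 - p_hat) / len(completed))
        q.append(q_row)
        u.append(u_row)
    return q, u


def combine_nested(q, u):
    """Pool q[m][n] (estimates) and u[m][n] (variances) over M models x N imputations.

    Returns the pooled estimate, its total variance, and the degrees of freedom.
    """
    m_count, n_count = len(q), len(q[0])
    if m_count < 2 or n_count < 2:
        raise ValueError("need at least 2 models and 2 imputations per model")
    q_model = [mean(row) for row in q]           # per-model means
    q_bar = mean(q_model)                        # overall estimate
    u_bar = mean(mean(row) for row in u)         # average within-imputation variance
    # Between-model mean square.
    b = n_count * sum((qm - q_bar) ** 2 for qm in q_model) / (m_count - 1)
    # Within-model (between-imputation) mean square.
    w = sum((q[m][n] - q_model[m]) ** 2
            for m in range(m_count) for n in range(n_count))
    w /= m_count * (n_count - 1)
    t_between = (1.0 + 1.0 / m_count) * b / n_count
    t_within = (1.0 - 1.0 / n_count) * w
    total = u_bar + t_between + t_within
    inv_df = ((t_between / total) ** 2 / (m_count - 1)
              + (t_within / total) ** 2 / (m_count * (n_count - 1)))
    df = 1.0 / inv_df if inv_df > 0 else math.inf
    return q_bar, total, df
```

In this sketch, centring `log_k_mean` at 0 treats the ignorable mechanism as the most plausible one, while a larger `log_k_sd` expresses more uncertainty about the mechanism and tends to increase the between-model component of the total variance.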
