Analysis of binary outcomes with missing data: missing = smoking, last observation carried forward, and a little multiple imputation.

AIMS Analysis of binary outcomes with missing data is a challenging problem in substance abuse studies. We consider this problem in a simple two-group design in which interest centers on comparing the groups on the binary outcome at a single timepoint.

DESIGN We describe how the deterministic assumptions of missing = smoking and last observation carried forward (LOCF) can be relaxed by allowing missingness to be related imperfectly to the binary outcome, either stratified on past values of the outcome or not. We also describe the use of multiple imputation to account for the uncertainty inherent in the imputed data.

SETTING We analyzed data from a published smoking cessation study that evaluated the effectiveness of adding group-based treatment adjuncts to an intervention comprising a television program and self-help materials.

PARTICIPANTS Participants were 489 smokers who registered for the television-based program and indicated an interest in attending group-based meetings.

MEASUREMENTS The smoking outcome was measured via telephone interviews at post-intervention and at 24 months.

FINDINGS AND CONCLUSIONS The significance of the group effect varied as a function of the assumed relationship between missingness and smoking. The 'conservative' missing = smoking assumption suggested a beneficial group effect on smoking cessation, but a sensitivity analysis confirmed this effect only when an extreme odds ratio of 5 between missingness and smoking was assumed. This type of sensitivity analysis is crucial in determining the role that missing data play in arriving at a study's conclusions.
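The relaxation of missing = smoking described under DESIGN can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it covers only the unstratified case (no conditioning on past outcomes, so it does not reproduce the LOCF variant). The sketch assumes that, within each group, the odds of smoking among participants with missing outcomes equal an assumed odds ratio times the odds among those observed. An odds ratio of 1 treats missing participants like observed ones, and a very large odds ratio approaches missing = smoking. All function names and the group counts in the usage example are hypothetical. For brevity, the observed smoking rate is treated as fixed rather than drawn from its posterior, so the pooled variance is somewhat too small.

```python
import math
import random


def imputed_smoking_prob(p_obs, odds_ratio):
    """Smoking probability for missing cases, given the observed smoking
    rate and an assumed odds ratio between missingness and smoking."""
    if p_obs >= 1.0:
        return 1.0  # everyone observed smokes, so the odds are unbounded
    odds = odds_ratio * p_obs / (1.0 - p_obs)
    return odds / (1.0 + odds)


def rubin_combine(estimates, variances):
    """Pool estimates and variances across imputations (Rubin's rules)."""
    m = len(estimates)
    q_bar = sum(estimates) / m
    within = sum(variances) / m
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    return q_bar, within + (1.0 + 1.0 / m) * between


def impute_group(n_smoking, n_quit, n_missing, odds_ratio, rng):
    """One imputation for one group: (smoking proportion, its variance)."""
    p_obs = n_smoking / (n_smoking + n_quit)
    p_mis = imputed_smoking_prob(p_obs, odds_ratio)
    # Draw a smoking status for every participant with a missing outcome.
    imputed_smokers = sum(rng.random() < p_mis for _ in range(n_missing))
    n = n_smoking + n_quit + n_missing
    p = (n_smoking + imputed_smokers) / n
    return p, p * (1.0 - p) / n


def sensitivity(control, treated, odds_ratio, m=20, seed=0):
    """Pooled difference in smoking proportions (control minus treated)
    and its z statistic, over m imputations (m must be at least 2)."""
    rng = random.Random(seed)
    diffs, variances = [], []
    for _ in range(m):
        p_c, v_c = impute_group(*control, odds_ratio, rng)
        p_t, v_t = impute_group(*treated, odds_ratio, rng)
        diffs.append(p_c - p_t)
        variances.append(v_c + v_t)
    diff, var = rubin_combine(diffs, variances)
    return diff, diff / math.sqrt(var)


if __name__ == "__main__":
    # Hypothetical counts per group: (observed smoking, observed quit, missing).
    control, treated = (60, 20, 20), (50, 30, 20)
    for odds_ratio in (1.0, 2.0, 5.0, 1e6):
        diff, z = sensitivity(control, treated, odds_ratio)
        print(f"OR={odds_ratio:g}: difference={diff:.3f}, z={z:.2f}")
```

Running the loop over several odds ratios shows how the group comparison shifts as the assumed link between missingness and smoking strengthens, which is the kind of sensitivity analysis the abstract recommends.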
