Multiple imputation versus data enhancement for dealing with missing data in observational health care outcome analyses.

Missing data are frequently encountered in observational studies. We compared three multiple imputation methods with a method of enhancing a clinical database by merging it with administrative data. The clinical database used for the comparison contained information collected from 6,065 cardiac care patients in 1995 in the province of Alberta, Canada. The effectiveness of each strategy was evaluated using measures of discrimination and goodness of fit for the 1995 data. The strategies were further evaluated by examining how well the resulting models predicted outcomes in data collected from patients in 1996. In general, the different methods produced similar results, with one of the multiple imputation methods demonstrating a slight advantage. We conclude that the choice of missing data strategy should be guided by the statistical expertise and data resources available.
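As an illustration of how results from multiply imputed datasets are typically combined, the following is a minimal sketch of Rubin's pooling rules. It is not taken from the study itself: the function name `pool_rubin` and the example numbers are hypothetical, and the abstract does not state which pooling procedure or software the authors used.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed datasets using Rubin's rules.

    estimates: point estimates of the coefficient, one per imputed dataset
    variances: squared standard errors, one per imputed dataset
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")

    q_bar = mean(estimates)           # pooled point estimate
    within = mean(variances)          # average within-imputation variance
    between = variance(estimates)     # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return q_bar, total


# Hypothetical log-odds estimates from three imputed datasets (illustrative only).
pooled, total_var = pool_rubin([0.5, 0.7, 0.6], [0.04, 0.04, 0.04])
print(pooled, total_var ** 0.5)  # pooled estimate and its standard error
```

The total variance exceeds the average within-imputation variance whenever the imputed datasets disagree, which is how multiple imputation carries the uncertainty due to the missing values into the final standard errors.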
