MIDAS: A SAS Macro for Multiple Imputation Using Distance-Aided Selection of Donors

In this paper we describe MIDAS, a SAS macro for multiple imputation using distance-aided selection of donors, which implements an iterative predictive mean matching hot-deck for imputing missing data. This is a flexible multiple imputation approach that can handle data in a variety of formats: continuous, ordinal, and scaled. Because the imputation models are implicit, it is not necessary to specify a parametric distribution for each variable to be imputed. MIDAS also allows users to assess the sensitivity of their inferences to different assumptions concerning the missing data mechanism. We present an example using MIDAS to impute missing data and compare MIDAS to existing missing data software.
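
To make the idea of a predictive mean matching hot-deck with distance-aided donor selection concrete, the following is a minimal Python sketch, not the MIDAS macro itself. It assumes a single fully observed predictor `x`, one incomplete variable `y` (missing values coded as `None`), a simple linear regression for the predictive means, and donor selection probabilities proportional to the inverse of the predictive-mean distance raised to a power `k`. The function names, the parameter `k`, and the bootstrap step used to vary the donor pool across imputations are illustrative assumptions, and the iteration over several incomplete variables is omitted.

```python
import random


def fit_line(x, y):
    """Least-squares intercept and slope for one predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx if sxx > 0 else 0.0  # guard a degenerate resample
    return my - slope * mx, slope


def pmm_impute(x, y, k=3.0, rng=None):
    """One completed copy of y using a distance-weighted hot-deck."""
    rng = rng or random.Random()
    obs = [i for i, yi in enumerate(y) if yi is not None]
    mis = [i for i, yi in enumerate(y) if yi is None]
    # Assumption: resample the observed cases so that the donor pool
    # and the fitted line differ from one imputation to the next.
    pool = [rng.choice(obs) for _ in obs]
    a, b = fit_line([x[i] for i in pool], [y[i] for i in pool])
    pred = [a + b * xi for xi in x]
    completed = list(y)
    for i in mis:
        # Donors with closer predictive means get larger weights;
        # a larger k concentrates selection on the nearest donors.
        weights = [(abs(pred[i] - pred[j]) + 1e-8) ** (-k) for j in pool]
        donor = rng.choices(pool, weights=weights, k=1)[0]
        completed[i] = y[donor]  # imputed value is an observed value
    return completed


def multiply_impute(x, y, m=5, k=3.0, seed=0):
    """Return m completed data sets."""
    rng = random.Random(seed)
    return [pmm_impute(x, y, k, rng) for _ in range(m)]


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5, 6]
    y = [1.1, 2.0, None, 4.2, None, 6.1]
    for completed in multiply_impute(x, y, m=3, seed=1):
        print(completed)
```

Because every imputed value is copied from an observed donor, no parametric distribution is specified for `y`, which is the sense in which the imputation model is implicit.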
