With the advent of general-purpose packages that support multiple imputation for analyzing datasets with missing data (e.g., Solas, SAS PROC MI, and S-Plus 6.0), we expect much greater use of multiple imputation in the future. For simplicity, some imputation packages assume that the joint distribution of the variables in the imputation model is multivariate normal, and impute the missing values from their conditional normal distribution given the observed data. If the variables with missing values are not normally distributed (say, binary), a normal draw can yield implausible values. To circumvent this problem, a number of methods have been developed, including rounding the imputed normal value to the closest observed value in the dataset. We show that this rounding can bias parameter estimates, whereas leaving the imputed value unrounded would introduce no bias. Rounding should therefore not be used indiscriminately, and caution is warranted when rounding imputed values, particularly for dichotomous variables.
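The effect of rounding can be illustrated with a small simulation. The sketch below is not the article's procedure. It assumes a single binary variable with values missing completely at random, and it draws imputations from a normal distribution fitted to the observed values, which is a simplification of proper multiple imputation (no parameter draws, a single imputed dataset). The function name `simulate_rounding_bias` and all parameter values are hypothetical choices for illustration.

```python
import numpy as np


def simulate_rounding_bias(p=0.2, n=200_000, missing_rate=0.5, seed=0):
    """Compare the estimated proportion with and without rounding.

    Simplified illustration (an assumption, not the article's method):
    one binary variable, MCAR missingness, one normal imputation per
    missing value using the observed mean and standard deviation.
    """
    rng = np.random.default_rng(seed)

    # Complete binary data with true success probability p.
    y = rng.binomial(1, p, size=n).astype(float)

    # Missing completely at random: each value is dropped independently.
    missing = rng.random(n) < missing_rate
    observed = y[~missing]

    # Normal imputation model fitted to the observed values.
    mu = observed.mean()
    sigma = observed.std(ddof=1)
    draws = rng.normal(mu, sigma, size=int(missing.sum()))

    # Variant 1: keep the (possibly implausible) normal draws as they are.
    unrounded = y.copy()
    unrounded[missing] = draws

    # Variant 2: round each draw to the closest observed value, 0 or 1.
    rounded = y.copy()
    rounded[missing] = (draws >= 0.5).astype(float)

    return {
        "true_p": p,
        "unrounded": float(unrounded.mean()),
        "rounded": float(rounded.mean()),
    }


if __name__ == "__main__":
    for name, value in simulate_rounding_bias().items():
        print(f"{name}: {value:.4f}")
```

Under these assumptions the unrounded estimate stays close to the true proportion, because the normal draws have the correct mean even though individual values fall outside {0, 1}. The rounded estimate drifts away from it, because the share of draws at or above 0.5 is generally not equal to the mean of the normal distribution.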