A Potential for Bias When Rounding in Multiple Imputation

With the advent of general-purpose packages that support multiple imputation for analyzing datasets with missing data (e.g., Solas, SAS PROC MI, and S-Plus 6.0), we expect much greater use of multiple imputation in the future. For simplicity, some imputation packages assume that the joint distribution of the variables in the imputation model is multivariate normal, and impute the missing data from the conditional normal distribution of the missing data given the observed data. If the variables with missing values are not normally distributed (say, binary), imputing a normal random variable can yield implausible values. To circumvent this problem, a number of methods have been developed, including rounding the imputed normal value to the closest observed value in the dataset. We show that this rounding can bias parameter estimates, whereas leaving the imputed value unrounded introduces no bias. Rounding should therefore not be used indiscriminately, and caution should be exercised when rounding imputed values, particularly for dichotomous variables.
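The direction of the bias can be illustrated with a minimal Python sketch. It is not the analysis in the article, and it rests on simplifying assumptions: a single Bernoulli variable with success probability strictly between 0 and 1, values missing completely at random, and imputations drawn from a normal distribution whose mean and variance are fixed at the observed-data estimates (a full multiple imputation procedure would also draw those parameters). The function names, sample sizes, and seed are hypothetical choices made for illustration.

```python
import random
from statistics import NormalDist, fmean


def rounded_imputation_mean(p):
    """Expected value of a N(p, p(1-p)) draw after rounding to 0 or 1.

    Assumes 0 < p < 1. The draw rounds to 1 exactly when it exceeds 0.5,
    so the expectation is the normal upper-tail probability at 0.5.
    """
    sd = (p * (1 - p)) ** 0.5
    return 1 - NormalDist(mu=p, sigma=sd).cdf(0.5)


def simulate(p, n_obs=2000, n_mis=2000, seed=0):
    """Compare the estimate of p with unrounded and rounded imputations."""
    rng = random.Random(seed)
    # Observed binary values; the remaining n_mis values are treated as missing.
    observed = [1 if rng.random() < p else 0 for _ in range(n_obs)]
    p_hat = fmean(observed)
    sd = (p_hat * (1 - p_hat)) ** 0.5
    # Normal imputations using the observed-data mean and variance.
    draws = [rng.gauss(p_hat, sd) for _ in range(n_mis)]
    # Round each imputation to the closer of the two observed values, 0 and 1.
    rounded = [1 if d > 0.5 else 0 for d in draws]
    est_unrounded = fmean(observed + draws)
    est_rounded = fmean(observed + rounded)
    return p_hat, est_unrounded, est_rounded


if __name__ == "__main__":
    for p in (0.1, 0.3, 0.5):
        print(p, round(rounded_imputation_mean(p), 4))
    print(simulate(0.1))
```

Under these assumptions, the rounded imputations have mean 1 - Phi((0.5 - p) / sqrt(p(1 - p))) rather than p. For p = 0.1 this is about 0.091, so estimates based on rounded values are pulled toward 0, while for p = 0.5 the two agree by symmetry. The unrounded draws have mean p by construction, which is why no bias arises when the imputed values are left as they are.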