Survival estimation and testing via multiple imputation

Multiple imputation is a technique for handling data sets with missing values. The method fills in each missing value several times, creating several augmented data sets. Each augmented data set is analyzed separately, and the results are combined to give a final result consisting of an estimate and a measure of uncertainty. In this paper we consider nonparametric multiple imputation methods for handling the missing event times of censored observations in the context of nonparametric survival estimation and testing. Two nonparametric imputation schemes are considered. In risk set imputation, the censored time is replaced by a random draw from the observed times among those at risk after the censoring time. In Kaplan-Meier (KM) imputation, the imputed time is a draw from the estimated distribution of event times among those at risk after the censoring time. We show that, with a large number of imputes, the estimates from both methods reproduce the KM estimator. In a simulation study we show that including a bootstrap stage in the multiple imputation algorithm gives confidence interval coverage rates comparable to those obtained from Greenwood's formula. Connections to the redistribute-to-the-right algorithm are discussed.
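As an illustration of risk set imputation, the following is a minimal Python sketch, not the authors' implementation. The function names (`risk_set_impute`, `km_survival`, `mi_survival`) and the toy data are hypothetical. Two details are assumptions that the abstract does not specify: if the drawn subject is itself censored, the sketch draws again from that subject's risk set, and if the risk set is empty, the observation stays censored. The bootstrap stage mentioned above is omitted.

```python
import random


def risk_set_impute(times, events, rng):
    """Return one imputed copy of (times, events).

    times  : observed follow-up times
    events : 1 if the time is an event, 0 if it is censored
    Assumption (not from the source): a draw that lands on a censored
    subject is followed by a further draw from that subject's risk set;
    an empty risk set leaves the observation censored.
    """
    n = len(times)
    new_times, new_events = list(times), list(events)
    for i in range(n):
        if events[i] == 1:
            continue  # event times are observed, nothing to impute
        t, d = times[i], 0
        while d == 0:
            # subjects still at risk strictly after the current time
            risk_set = [j for j in range(n) if times[j] > t]
            if not risk_set:
                break
            j = rng.choice(risk_set)
            t, d = times[j], events[j]
        new_times[i], new_events[i] = t, d
    return new_times, new_events


def km_survival(times, events, t):
    """Kaplan-Meier estimate of S(t) from possibly censored data."""
    s = 1.0
    event_times = sorted({x for x, d in zip(times, events) if d == 1 and x <= t})
    for u in event_times:
        at_risk = sum(1 for x in times if x >= u)
        deaths = sum(1 for x, d in zip(times, events) if x == u and d == 1)
        s *= 1.0 - deaths / at_risk
    return s


def mi_survival(times, events, t, n_imputes, rng):
    """Average the estimate of S(t) over n_imputes imputed data sets."""
    total = 0.0
    for _ in range(n_imputes):
        imp_times, imp_events = risk_set_impute(times, events, rng)
        total += km_survival(imp_times, imp_events, t)
    return total / n_imputes


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    toy_times = [2, 3, 4, 5, 7, 8]
    toy_events = [1, 0, 1, 1, 0, 1]
    rng = random.Random(0)
    print("KM estimate of S(5):", km_survival(toy_times, toy_events, 5))
    print("MI estimate of S(5):", mi_survival(toy_times, toy_events, 5, 4000, rng))
```

Under these assumptions, the averaged estimate should approach the KM estimate as the number of imputes grows, which is the behavior the abstract describes. The repeated draw on censored subjects is also what links this sketch to the redistribute-to-the-right idea.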