Clustering of Variables with Missing Data: Application to Preference Studies

Clustering of variables around latent components is a means of organizing multivariate data into meaningful subgroups. We extend the approach to situations with missing data. A straightforward method is to replace the missing values with estimates and then cluster the completed data set. We improve on this basic imputation with more sophisticated procedures that, after an initial clustering of the variables, update the imputations within each group. We compare the performance of the different imputation methods in a simulation study.
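
The two-stage idea described above can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes mean imputation as the basic method, a simplified K-means-like partition of the variables by squared covariance with each group's first principal component, and a within-group update that regresses each variable on its group component. All function names (mean_impute, cluster_variables, refine_imputation) are hypothetical, and every variable is assumed to have at least two observed values.

```python
import numpy as np

def mean_impute(X):
    """Basic imputation: replace each NaN by the mean of its column's observed values."""
    X = np.asarray(X, dtype=float)
    filled = X.copy()
    rows, cols = np.where(np.isnan(X))
    filled[rows, cols] = np.nanmean(X, axis=0)[cols]
    return filled

def first_component(Z):
    """Unit-norm first principal component scores of a centered n x p block."""
    U, _, _ = np.linalg.svd(Z, full_matrices=False)
    return U[:, 0]

def cluster_variables(X, n_groups, n_iter=20):
    """Simplified partition of the columns of a complete matrix X (an assumption,
    not the paper's exact algorithm): alternate between computing one latent
    component per group and reassigning each variable to the component with
    which it has the largest squared covariance."""
    Z = X - X.mean(axis=0)
    labels = np.arange(Z.shape[1]) % n_groups  # deterministic round-robin start
    comps = np.zeros((Z.shape[0], n_groups))
    for _ in range(n_iter):
        for k in range(n_groups):
            members = labels == k
            if members.any():
                comps[:, k] = first_component(Z[:, members])
        # Proportional to the squared covariance, since each component has unit norm.
        new_labels = ((Z.T @ comps) ** 2).argmax(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels

def refine_imputation(X, n_groups, n_rounds=5):
    """Cluster the mean-imputed data, then repeatedly update the imputations
    within each group and re-cluster."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    filled = mean_impute(X)
    labels = cluster_variables(filled, n_groups)
    for _ in range(n_rounds):
        for k in range(n_groups):
            members = np.where(labels == k)[0]
            if members.size == 0:
                continue
            block = filled[:, members]
            comp = first_component(block - block.mean(axis=0))
            for j in members:
                miss = missing[:, j]
                if not miss.any():
                    continue
                obs = ~miss
                # Least-squares line x_j ~ a + b * comp, fitted on observed rows only.
                b, a = np.polyfit(comp[obs], X[obs, j], 1)
                filled[miss, j] = a + b * comp[miss]
        labels = cluster_variables(filled, n_groups)
    return filled, labels
```

Calling refine_imputation(X, n_groups=2) on a matrix X with NaN entries returns the completed matrix and one group label per variable. Observed entries are left unchanged; only the missing cells are updated.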