Issues in Multivariate Cluster Analysis

Using a Monte Carlo simulation, the research in this article addresses two key questions about the accuracy of cluster analysis in reproducing a known true cluster model. First, how is the accuracy affected by different ways of measuring interunit similarity, in this case different ways of using principal components analysis? Second, how is the accuracy affected by the quality of the characteristics data and by different procedures for handling missing information? The results indicate that using principal components analysis is superior to not using it and that the choice of how to use the principal components results may be critical. The results also indicate that the impact of data quality differences may be minimal, but that there are important differences among the procedures for handling missing data.
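
The general shape of such a simulation can be sketched as follows. This is a minimal illustration under stated assumptions, not the article's actual design: the data-generating model, the use of k-means as the clustering method, the Rand index as the accuracy measure, mean imputation as the missing-data procedure, and every function name and parameter value below are hypothetical choices made only for the example.

```python
import numpy as np

def make_data(rng, n_per=50, k=3, n_signal=2, n_noise=6, sep=4.0):
    """Draw units from k known clusters (assumed model: only the
    first n_signal variables carry cluster structure)."""
    labels = np.repeat(np.arange(k), n_per)
    centers = rng.normal(scale=sep, size=(k, n_signal))
    signal = centers[labels] + rng.normal(size=(labels.size, n_signal))
    noise = rng.normal(size=(labels.size, n_noise))
    return np.hstack([signal, noise]), labels

def knock_out(X, frac, rng):
    """Set roughly a fraction `frac` of entries to NaN, completely at random."""
    X = X.copy()
    X[rng.random(X.shape) < frac] = np.nan
    return X

def mean_impute(X):
    """Replace each missing value with its column mean (one assumed procedure)."""
    col_means = np.nanmean(X, axis=0)
    return np.where(np.isnan(X), col_means, X)

def pca_scores(X, n_comp):
    """Scores on the first n_comp principal components of the centered data."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_comp].T

def kmeans(X, k, rng, n_iter=50):
    """Plain Lloyd iterations from k randomly chosen starting units."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dist.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):  # leave an empty cluster's center unchanged
                centers[j] = X[assign == j].mean(axis=0)
    return assign  # labels from the final assignment pass

def rand_index(a, b):
    """Share of unit pairs on which two partitions agree (same vs. different cluster)."""
    same_a = a[:, None] == a[None, :]
    same_b = b[:, None] == b[None, :]
    iu = np.triu_indices(len(a), k=1)
    return float((same_a[iu] == same_b[iu]).mean())

def simulate(n_reps=20, seed=0, miss_frac=0.0):
    """Mean recovery accuracy when clustering raw variables vs. PCA scores."""
    rng = np.random.default_rng(seed)
    raw, pca = [], []
    for _ in range(n_reps):
        X, truth = make_data(rng)
        if miss_frac > 0:
            X = mean_impute(knock_out(X, miss_frac, rng))
        raw.append(rand_index(truth, kmeans(X, 3, rng)))
        pca.append(rand_index(truth, kmeans(pca_scores(X, 2), 3, rng)))
    return float(np.mean(raw)), float(np.mean(pca))

if __name__ == "__main__":
    for frac in (0.0, 0.1):
        raw, pca = simulate(miss_frac=frac)
        print(f"missing={frac:.1f}  raw={raw:.3f}  pca={pca:.3f}")
```

Each replication generates data from a known partition, clusters it with and without a principal components step, and scores the recovered partition against the truth. Varying `miss_frac` or swapping `mean_impute` for another procedure mirrors the second question. The numbers this sketch produces depend entirely on the assumed settings and should not be read as reproducing the article's findings.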