Favoring the k-Means Algorithm with Initialization Methods

Clustering algorithms are unsupervised learning algorithms and, among the many available, k-Means is one of the most popular and successful. Its performance, however, depends heavily on a ‘good’ initialization of the k group centers (centroids), as well as on the value chosen for k, the number of groups in the final clustering. This chapter reports experiments with five initialization algorithms from the literature, namely Method1, k-Means++, CCIA, Maedeh&Suresh, and SPSS, to empirically evaluate their contribution to improving k-Means performance.
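
To make the notion of centroid initialization concrete, the following is a minimal sketch of the seeding step commonly associated with k-Means++: the first center is drawn uniformly at random, and each further center is drawn with probability proportional to the squared distance from a point to its nearest already-chosen center. The function names (`sq_dist`, `kmeanspp_init`), the `seed` parameter, and the fallback for coincident points are illustrative assumptions, not the implementation used in the experiments of this chapter.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeanspp_init(points, k, seed=0):
    """Pick k initial centers from `points` by squared-distance sampling.

    Hypothetical helper for illustration; `points` is a list of numeric tuples.
    """
    rng = random.Random(seed)
    centers = [rng.choice(points)]  # first center: uniform at random
    # d2[i] holds the squared distance from points[i] to its nearest center
    d2 = [sq_dist(p, centers[0]) for p in points]
    while len(centers) < k:
        total = sum(d2)
        if total == 0:
            # Assumption: if every point coincides with a chosen center,
            # fall back to a uniform draw so the loop still terminates.
            centers.append(rng.choice(points))
            continue
        r = rng.uniform(0, total)
        acc = 0.0
        chosen = None
        for i, w in enumerate(d2):
            if w <= 0:
                continue  # points already used as centers have zero weight
            acc += w
            chosen = i
            if acc >= r:
                break
        centers.append(points[chosen])
        # Update nearest-center distances with the newly added center
        d2 = [min(old, sq_dist(p, points[chosen])) for old, p in zip(d2, points)]
    return centers
```

The returned centers would then be handed to the usual k-Means iterations in place of purely random starting centroids. Because a point that already serves as a center has zero weight, the sketch tends to spread the initial centers across the data, which is the intuition behind initialization methods of this kind.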