An examination of procedures for determining the number of clusters in a data set

A problem common to all clustering techniques is the difficulty of deciding the number of clusters present in the data. The aim of this paper is to compare three methods based on the hypervolume criterion with four other well-known methods. The evaluation is conducted on artificial data sets. To provide a variety of solutions, the data sets are analysed by six clustering methods. We conclude by summarising the performance of each method and by giving some guidance for choosing among them.