A Combination Approach to Cluster Validation Based on Statistical Quantiles

In this paper, we analyse different techniques for detecting the number of clusters in a data set, also known as cluster validation techniques. We also propose a new algorithm that combines several validation indexes to simultaneously validate several partitions of a data set generated by different clustering techniques and object distances. The existing validation techniques and the combination algorithm have been tested on three data sets: a synthetic mixture-of-Gaussians data set, the NCI60 microarray data set, and the Iris data set. The evaluation results show that the proposed approach performs adequately even when individual input validity indexes fail to identify the true number of clusters.
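To make the idea of combining validation indexes concrete, the following is a minimal sketch and not the algorithm proposed in this paper. It assumes that each index has already been evaluated for every candidate number of clusters, converts each index's scores to empirical quantile ranks so that indexes on different scales become comparable, and takes the median rank across indexes. The function names, the choice of the median, and the toy scores are all assumptions made for illustration; the scores are invented values, not experimental results.

```python
from statistics import median


def quantile_ranks(scores, higher_is_better=True):
    """Map each score to its empirical quantile in (0, 1], where 1 is best."""
    n = len(scores)
    signed = scores if higher_is_better else [-s for s in scores]
    # Fraction of scores that are less than or equal to the current one.
    return [sum(other <= s for other in signed) / n for s in signed]


def combine_indexes(index_scores, higher_is_better):
    """Combine several validity indexes by the median of their quantile ranks.

    index_scores: {index_name: {k: score}} with the same candidate k for all.
    higher_is_better: {index_name: bool} giving each index's direction.
    Returns (best_k, {k: combined_rank}).
    """
    ks = sorted(next(iter(index_scores.values())))
    ranks_per_k = {k: [] for k in ks}
    for name, by_k in index_scores.items():
        ranks = quantile_ranks([by_k[k] for k in ks], higher_is_better[name])
        for k, rank in zip(ks, ranks):
            ranks_per_k[k].append(rank)
    combined = {k: median(ranks) for k, ranks in ranks_per_k.items()}
    best_k = max(combined, key=combined.get)
    return best_k, combined


# Toy scores (invented for illustration) for k = 2..5.
toy_scores = {
    "silhouette": {2: 0.41, 3: 0.58, 4: 0.47, 5: 0.39},
    "dunn": {2: 0.30, 3: 0.22, 4: 0.19, 5: 0.12},  # on its own prefers k = 2
    "davies_bouldin": {2: 0.95, 3: 0.62, 4: 0.80, 5: 1.10},  # lower is better
}
toy_directions = {"silhouette": True, "dunn": True, "davies_bouldin": False}

best_k, combined = combine_indexes(toy_scores, toy_directions)
print(best_k)  # 3
```

In this toy example one index on its own prefers k = 2, yet the combined rank still selects k = 3, which illustrates why a combination can be more robust than any single index.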