Hierarchical K-means: an algorithm for centroids initialization for K-means

Randomly generated starting points often cause K-means to converge to a local optimum. Better clustering results can be obtained by running K-means more than once. However, it is difficult to decide how many runs are needed to obtain a better result. In this paper, we propose a new approach to optimizing the initial centroids for K-means. It uses all the clustering results from a fixed number of K-means runs, even though some of them reach local optima. We then combine these results with a hierarchical algorithm to determine the initial centroids for K-means. The experimental results show how effective the proposed method is in improving the clustering results of K-means.
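The idea described above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not specify the linkage criterion, the number of runs, or the distance measure, so centroid linkage, ten runs, and squared Euclidean distance are assumptions here, and the function names (`kmeans`, `agglomerate`, `hierarchical_kmeans`) are hypothetical.

```python
import random


def dist2(a, b):
    """Squared Euclidean distance between two points (tuples)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def kmeans(data, init, iters=100):
    """Plain Lloyd iterations from the given initial centroids."""
    cents = list(init)
    for _ in range(iters):
        groups = [[] for _ in cents]
        for p in data:
            j = min(range(len(cents)), key=lambda i: dist2(p, cents[i]))
            groups[j].append(p)
        # An empty cluster keeps its previous centroid.
        new = [mean(g) if g else c for g, c in zip(groups, cents)]
        if new == cents:
            break
        cents = new
    return cents


def agglomerate(points, k):
    """Merge the two closest clusters until k remain (assumed: centroid linkage)."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = dist2(mean(clusters[i]), mean(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)
    return [mean(c) for c in clusters]


def hierarchical_kmeans(data, k, runs=10, seed=0):
    """Pool centroids from several random K-means runs, reduce the pool to k
    points by agglomerative clustering, and use them to seed a final run."""
    rng = random.Random(seed)
    pool = []
    for _ in range(runs):
        # Keep every run's centroids, including runs stuck in local optima.
        pool.extend(kmeans(data, rng.sample(data, k)))
    init = agglomerate(pool, k)
    return kmeans(data, init)


if __name__ == "__main__":
    toy = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(hierarchical_kmeans(toy, k=2))
```

The pairwise search in `agglomerate` is quadratic per merge, which is acceptable in this sketch because it runs on the small pool of `runs * k` centroids rather than on the full data set.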
