Paper Information - A New Algorithm for Optimization of K-Means Clustering with Determining Maximum Distance Between Centroids

A New Algorithm for Optimization of K-Means Clustering with Determining Maximum Distance Between Centroids

The K-means algorithm is very sensitive to its initial starting points. Because these starting points are generated randomly, K-means does not guarantee unique clustering results, which makes it very difficult to reach the global optimum. This paper proposes a new algorithm for optimizing K-means clustering. It places the initial centroids so that the accumulated distance among them is as large as possible. An accumulated distance metric is built first and used to designate the initial centroids. Each new initial centroid is the data point with the maximum accumulated distance metric. The process iterates until all initial centroids have been determined. The proposed approach positions all centroids far apart from one another in the data distribution. Experimental results show that the proposed algorithm improves the clustering results of K-means.
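
The selection procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how the first centroid is chosen, so the sketch seeds with the point farthest from the data mean, and it reads "accumulated distance" as the sum of Euclidean distances from a point to every centroid chosen so far. The function name `farthest_accumulated_init` is hypothetical.

```python
import math


def farthest_accumulated_init(points, k):
    """Pick k initial centroids by maximum accumulated distance (sketch)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    n = len(points)
    dim = len(points[0])

    # Assumption (not stated in the abstract): the first centroid is the
    # data point farthest from the mean of the data set.
    mean = [sum(p[d] for p in points) / n for d in range(dim)]
    first = max(range(n), key=lambda i: math.dist(points[i], mean))
    chosen = [first]

    # accumulated[i] is the sum of distances from point i to every
    # centroid chosen so far.
    accumulated = [math.dist(p, points[first]) for p in points]

    while len(chosen) < k:
        # The next centroid is the unchosen point with the largest
        # accumulated distance.
        candidates = [i for i in range(n) if i not in chosen]
        nxt = max(candidates, key=lambda i: accumulated[i])
        chosen.append(nxt)

        # Fold the distance to the new centroid into the running totals.
        for i, p in enumerate(points):
            accumulated[i] += math.dist(p, points[nxt])

    return [points[i] for i in chosen]


data = [(0, 0), (1, 0), (10, 0), (11, 0), (5, 9)]
initial_centroids = farthest_accumulated_init(data, 3)
```

The returned points could then be passed as the starting centroids to any standard K-means routine in place of random initialization.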

Ali Ridho Barakbah
