An Improved Initialization Center Algorithm for K-Means Clustering

The traditional k-means algorithm is sensitive to the choice of initial centers. To address this problem, this paper proposes a new method for selecting the initial centers that reduces this sensitivity. The algorithm first computes the density of the region to which each data object belongs; it then selects k data objects that lie in high-density regions as the initial centers. Experiments on standard data sets from the UCI repository show that the proposed method produces high-purity clustering results and reduces the sensitivity to the initial centers to some extent.
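
The two steps described above (estimate a density for each data object, then take k objects from high-density regions as initial centers) can be illustrated with a minimal sketch. The abstract does not give the paper's density definition or selection rule, so everything specific here is an assumption: density is taken to be the number of other points within a fixed `radius`, and a hypothetical `min_sep` parameter keeps the chosen centers from all falling in the same dense region. The function name `density_based_init` is likewise illustrative and is not taken from the paper.

```python
import numpy as np

def density_based_init(X, k, radius, min_sep=None):
    """Pick k initial centers from high-density regions (illustrative sketch).

    Assumptions not taken from the paper: density is a neighbor count within
    `radius`, and chosen centers must be more than `min_sep` apart.
    """
    X = np.asarray(X, dtype=float)
    if min_sep is None:
        min_sep = radius
    # Pairwise Euclidean distances; O(n^2) memory, fine for small data sets.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    # Density of a point: number of other points within `radius`.
    density = (dist <= radius).sum(axis=1) - 1
    # Visit points from densest to sparsest (stable sort keeps input order on ties).
    order = np.argsort(-density, kind="stable")
    chosen = []
    for i in order:
        # Accept a point only if it is far enough from every center chosen so far.
        if all(dist[i, j] > min_sep for j in chosen):
            chosen.append(i)
        if len(chosen) == k:
            break
    if len(chosen) < k:
        raise ValueError("fewer than k well-separated points; reduce min_sep")
    return X[chosen]

# Toy data: three tight groups of five points each, plus one isolated outlier.
offsets = np.array([[0, 0], [0.1, 0], [0, 0.1], [0.1, 0.1], [0.05, 0.05]])
X = np.vstack([offsets + base for base in ([0, 0], [5, 5], [10, 0])] + [[[20, 20]]])
centers = density_based_init(X, k=3, radius=0.5, min_sep=2.0)
```

On this toy data the outlier has no neighbors within `radius`, so it is visited last and is never selected, whereas a uniformly random choice of three points could pick it. The returned `centers` can then be passed to any k-means implementation as the starting centers.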
