Efficient and Fast Initialization Algorithm for K-means Clustering

The widely used K-means clustering algorithm is sensitive to the choice of initial centroids and may converge to a poor local minimum of its criterion function. This paper presents a new algorithm for initializing K-means. The proposed procedure for selecting the starting centroids allows K-means to converge to a "better" local minimum. Our results show that refined initial centroids do lead to improved solutions. A framework for implementing and testing various clustering algorithms is also presented, and it was used to develop and evaluate the algorithm.
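The sensitivity to initial centroids can be illustrated with a small sketch. The code below is a plain Lloyd-style K-means, not the initialization procedure proposed in this paper; the function names (`kmeans`, `assign`, `sq_dist`), the toy data set, and the two hand-picked starting configurations are assumptions made purely for illustration. The criterion function is taken to be the sum of squared distances from each point to its assigned centroid.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [
        min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
        for p in points
    ]


def kmeans(points, init_centroids, max_iter=100):
    """Basic Lloyd iteration; returns (centroids, labels, sse)."""
    centroids = [tuple(c) for c in init_centroids]
    for _ in range(max_iter):
        labels = assign(points, centroids)
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                # Move the centroid to the mean of its members.
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                # Illustrative choice: an empty cluster keeps its old centroid.
                new_centroids.append(old)
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    labels = assign(points, centroids)
    sse = sum(sq_dist(p, centroids[l]) for p, l in zip(points, labels))
    return centroids, labels, sse


# Toy data (hypothetical): corners of a 4 x 1 rectangle, K = 2.
points = [(0, 0), (0, 1), (4, 0), (4, 1)]

# Start A separates the left pair from the right pair.
_, _, sse_good = kmeans(points, [(0, 0.5), (4, 0.5)])

# Start B separates the bottom pair from the top pair and is already stable.
_, _, sse_bad = kmeans(points, [(2, 0), (2, 1)])

print(sse_good, sse_bad)
```

Both runs stop at a fixed point of the iteration, yet the second start yields a criterion value sixteen times larger than the first, which is the kind of poor local minimum a better initialization aims to avoid.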
