Review of Existing Methods for Finding Initial Clusters in K-means Algorithm

Clustering is a data mining task that groups objects on the basis of their nearness to a central value. It has found many applications in fields such as business, image processing, and medicine. K-means is one of the most widely used clustering methods because it is simple and efficient. Its output depends on the central values (centroids) chosen at the start, so the accuracy of the algorithm depends heavily on that choice. This paper presents the various methods developed by researchers for finding initial clusters for K-means.

General Terms: Accuracy, Centroids, Complexity, Dataset, Initial Clusters, K-means
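The dependence on the initial central values can be seen in a small sketch. The code below is a plain Lloyd-style K-means in pure Python, run twice on the same four points with two different starting centroids. The function names `sq_dist` and `kmeans` and the four-point dataset are illustrative assumptions, not taken from the paper or from any of the surveyed methods.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points]


def kmeans(points, centers, max_iter=100):
    """Plain Lloyd iteration starting from the given initial centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = []
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # Update step: move the center to the mean of its members.
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                # Keep an empty cluster's center where it was.
                new_centers.append(centers[j])
        if new_centers == centers:
            break  # converged: centers no longer move
        centers = new_centers
    labels = assign(points, centers)
    sse = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return centers, labels, sse


# Hypothetical data: corners of a 4-by-1 rectangle, clustered with k = 2.
points = [(0, 0), (0, 1), (4, 0), (4, 1)]

# Start with one center on each short side of the rectangle.
_, labels_good, sse_good = kmeans(points, [(0, 0), (4, 0)])

# Start with both centers on the same short side.
_, labels_bad, sse_bad = kmeans(points, [(0, 0), (0, 1)])

print(labels_good, sse_good)
print(labels_bad, sse_bad)
```

In this toy case the first start groups the left and right pairs of points, while the second start converges to a top/bottom split with a much larger sum of squared errors. Both runs have converged, which is why the choice of initial centroids matters.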
