Analysis of Initial Centers for k-Means Clustering Algorithm

Data analysis plays an important role in understanding different events. Cluster analysis is a widely used data mining technique for knowledge discovery, with applications in artificial intelligence, pattern matching, image segmentation, compression, and other fields. Clustering is the process of grouping objects so that the objects in one group are similar to one another and different from the objects in other groups. The k-Means clustering algorithm is one of the most popular clustering algorithms and has gained a lot of attention because of its simplicity and ease of implementation. Its efficiency is limited, however, by the random selection of the k initial centers. We therefore survey different approaches to selecting initial centers for the k-Means algorithm. We also present a comparative analysis of the original k-Means algorithm and Data Clustering with Modified k-Means Algorithm, both implemented in MATLAB R2009b. Our implementation uses Euclidean distance as the similarity measure, and we evaluate the results.

General Terms: Data Mining, Clustering Algorithm, Objects.
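To make the role of the initial centers concrete, here is a minimal Python sketch of standard k-Means with random initial centers and Euclidean distance. It is not the authors' MATLAB implementation, and it does not reproduce any of the surveyed initialization methods; the names `k_means`, `assign`, and `sse`, the `seed` and `max_iter` parameters, and the toy data are all assumptions made for illustration.

```python
import math
import random


def euclidean(p, q):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def assign(points, centers):
    """Group every point with its nearest center (ties go to the lower index)."""
    clusters = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda i: euclidean(p, centers[i]))
        clusters[nearest].append(p)
    return clusters


def k_means(points, k, seed=None, max_iter=100):
    """Standard k-Means; the k initial centers are drawn at random from the data."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # random initialization (hypothetical seed parameter)
    for _ in range(max_iter):
        clusters = assign(points, centers)
        # Move each center to the mean of its cluster; keep it in place if the cluster is empty.
        new_centers = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centers == centers:  # no center moved, so the assignment is stable
            break
        centers = new_centers
    return centers, assign(points, centers)


def sse(centers, clusters):
    """Sum of squared Euclidean distances from each point to its cluster center."""
    return sum(euclidean(p, c) ** 2 for c, cluster in zip(centers, clusters) for p in cluster)


if __name__ == "__main__":
    # Toy data chosen for illustration only.
    data = [(1.0, 1.0), (1.5, 2.0), (3.0, 4.0), (5.0, 7.0), (3.5, 5.0), (4.5, 5.0), (3.5, 4.5)]
    for seed in range(3):
        centers, clusters = k_means(data, k=2, seed=seed)
        print(seed, centers, round(sse(centers, clusters), 3))
```

Running the script with several seeds prints the final centers and the sum of squared errors for each run. On harder data sets, different seeds can converge to different local optima, which is the sensitivity that initial-center selection methods aim to reduce.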
