Clustering is the process of organizing data objects into a set of disjoint classes called clusters, and it is an example of unsupervised classification. Cluster analysis seeks to partition a given data set into groups based on specified features so that the data points within a group are more similar to each other than to the points in different groups. Cluster analysis is one of the primary data analysis methods, and k-means is one of the best-known clustering algorithms. The k-means algorithm is frequently used in data mining because of its performance in clustering massive data sets. The final clustering result of the k-means algorithm depends greatly on the choice of the initial centroids, which are selected randomly. A new method is proposed for finding better initial centroids and for assigning the data points to suitable clusters efficiently, with reduced time complexity. The proposed algorithm achieves higher accuracy with less computational time than the original k-means clustering algorithm.
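To make the dependence on the initial centroids concrete, the following is a minimal sketch of the original k-means procedure with randomly selected initial centroids. It is not the proposed method, whose details are not given here. The function names (`kmeans`, `squared_distance`, `mean_point`), the `seed` and `max_iter` parameters, the use of squared Euclidean distance, and the sample data are all assumptions made for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, seed=None, max_iter=100):
    """Standard k-means with random initial centroids (illustrative sketch).

    Returns (centroids, labels, sse), where sse is the sum of squared
    distances from each point to its assigned centroid.
    """
    rng = random.Random(seed)
    # Random initialization: pick k distinct data points as starting centroids.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centroids are stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = mean_point(members)
    sse = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, sse


if __name__ == "__main__":
    # Hypothetical toy data: two loose groups in the plane.
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    for seed in range(3):
        _, labels, sse = kmeans(data, k=2, seed=seed)
        print(seed, labels, round(sse, 3))
```

Running the sketch with several seeds and comparing the resulting sum of squared errors is one simple way to observe how much the outcome can vary with the random starting centroids, which is the sensitivity that motivates a better initialization.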