Paper information - Improved K-means clustering algorithm based on the optimized initial centriods

Improved K-means clustering algorithm based on the optimized initial centriods

K-means is one of the most widely used clustering algorithms and has been applied in many fields of science and technology. A major problem of the k-means algorithm is that it produces different clusterings depending on the initial centroids, which are chosen at random. In addition, when many feature values are taken into consideration, performance degrades severely. In this paper, an improved k-means clustering algorithm with variance is proposed. It selects the initial centroids using a Huffman tree structure. To address the high-dimensional problem, principal component analysis based on variance is adopted. The experimental results confirm that the proposed algorithm is efficient, with better clustering accuracy and much shorter execution time.
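The abstract does not give the construction details, so the following is only a minimal sketch of one plausible reading of Huffman-tree-based initialization. It assumes that the two closest nodes are repeatedly merged into their mean, as in building a Huffman tree bottom-up, until k nodes remain, and that those k nodes seed a standard k-means iteration. The function names `huffman_style_init` and `kmeans` are hypothetical, and the variance-based principal component analysis step is omitted here.

```python
import math

def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def huffman_style_init(points, k):
    """Assumed scheme: merge the two closest nodes into their mean
    until only k nodes remain; these become the initial centroids."""
    nodes = [list(p) for p in points]
    while len(nodes) > k:
        best = None
        # Brute-force search for the closest pair of nodes.
        for i in range(len(nodes)):
            for j in range(i + 1, len(nodes)):
                d = dist(nodes[i], nodes[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = mean([nodes[i], nodes[j]])
        nodes = [p for idx, p in enumerate(nodes) if idx not in (i, j)]
        nodes.append(merged)
    return nodes

def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
            for p in points]

def kmeans(points, centroids, max_iter=100):
    """Standard Lloyd iterations starting from the given centroids."""
    for _ in range(max_iter):
        labels = assign(points, centroids)
        new_centroids = []
        for c in range(len(centroids)):
            members = [p for p, l in zip(points, labels) if l == c]
            # Keep the old centroid if a cluster ends up empty.
            new_centroids.append(mean(members) if members else centroids[c])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return assign(points, centroids), centroids

# Usage on two small, well-separated groups of illustrative points.
data = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
init = huffman_style_init(data, 2)
labels, centers = kmeans(data, init)
```

Because the initialization is deterministic, repeated runs on the same data start from the same centroids, which is the property the abstract contrasts with random selection. The brute-force pair search is quadratic per merge and is only meant to keep the sketch short.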

Shunye Wang
