Paper information - K-means clustering algorithm based on coefficient of variation

The performance of the k-means clustering algorithm depends on the choice of distance metric. Euclidean distance is the usual similarity measure in k-means, but it treats all features equally and therefore does not accurately reflect the similarity among samples. This paper proposes a k-means clustering algorithm based on the coefficient of variation (CV-k-means) to address this problem. CV-k-means uses a weight vector derived from the coefficients of variation of the features to reduce the effect of irrelevant features. The experimental results show that the proposed algorithm produces better clustering results than the standard k-means algorithm.
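The abstract does not give the weighting formula, so the following is only a minimal sketch of the idea under stated assumptions: each feature j gets a weight w_j equal to its coefficient of variation (standard deviation divided by absolute mean) normalized so the weights sum to 1, and samples are assigned to centers with the weighted squared Euclidean distance sum_j w_j (x_j - c_j)^2. The function names `cv_weights` and `weighted_kmeans`, the normalization, and the random initialization are hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np


def cv_weights(X, eps=1e-12):
    """Per-feature weights from the coefficient of variation (assumed form).

    CV_j = std_j / |mean_j|; weights are the CVs normalized to sum to 1.
    eps guards against division by zero for features with a mean near zero,
    where the coefficient of variation is not well defined.
    """
    mean = np.abs(X.mean(axis=0))
    std = X.std(axis=0)
    cv = std / (mean + eps)
    return cv / cv.sum()


def assign(X, centers, weights):
    """Index of the nearest center under the weighted squared Euclidean distance."""
    diff = X[:, None, :] - centers[None, :, :]      # shape (n, k, d)
    dist2 = (weights * diff ** 2).sum(axis=2)       # shape (n, k)
    return dist2.argmin(axis=1)


def weighted_kmeans(X, k, weights, n_iter=100, seed=0):
    """Lloyd-style k-means using a fixed feature-weight vector (a sketch)."""
    rng = np.random.default_rng(seed)
    # Assumption: initial centers are k distinct samples chosen at random.
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        labels = assign(X, centers, weights)
        new_centers = centers.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:                    # keep old center if a cluster empties
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign(X, centers, weights), centers


# Toy data: feature 0 separates two groups, feature 1 is nearly constant.
X = np.array([[1.0, 10.0], [1.2, 10.1], [0.9, 9.9],
              [5.0, 10.0], [5.1, 10.05], [4.9, 9.95]])
w = cv_weights(X)
labels, centers = weighted_kmeans(X, k=2, weights=w)
```

In this toy data the nearly constant second feature has a much smaller coefficient of variation than the first, so it contributes little to the distance. Whether high-CV features should always count as more relevant depends on the data, and the weighting used in the paper may differ from this sketch.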

Shuhua Ren | Alin Fan
