Effect of Distance Functions on K-Means Clustering Algorithm

Cluster analysis is one of the most significant steps in data mining. This paper discusses the k-means clustering algorithm and the distance functions it can use, in particular the Euclidean and Manhattan distance functions. Experimental results show the effect of each distance function on the k-means clustering algorithm. The results also show that the choice of distance function affects the size of the clusters that k-means forms.
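To make the comparison concrete, the following is a minimal sketch of the k-means assignment step under both distance functions. It is not the paper's implementation: the function names (`euclidean`, `manhattan`, `assign`, `cluster_sizes`), the four sample points, and the two fixed centroids are all assumptions chosen for illustration. For points p and q, the Euclidean distance is sqrt(sum_i (p_i - q_i)^2) and the Manhattan distance is sum_i |p_i - q_i|.

```python
import math
from collections import Counter


def euclidean(p, q):
    """Euclidean (L2) distance: square root of the summed squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def manhattan(p, q):
    """Manhattan (L1) distance: sum of the absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


def assign(points, centroids, distance):
    """k-means assignment step: label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: distance(p, centroids[j]))
        for p in points
    ]


def cluster_sizes(labels, k):
    """Count how many points were assigned to each of the k clusters."""
    counts = Counter(labels)
    return [counts.get(j, 0) for j in range(k)]


# Hypothetical toy data, not taken from the paper's experiments.
points = [(0, 0), (1, 0), (4, 2), (6, 1)]
centroids = [(3, 3), (5, 0)]

labels_euclidean = assign(points, centroids, euclidean)
labels_manhattan = assign(points, centroids, manhattan)

print("Euclidean:", labels_euclidean, cluster_sizes(labels_euclidean, 2))
print("Manhattan:", labels_manhattan, cluster_sizes(labels_manhattan, 2))
```

On this toy data the same centroids produce cluster sizes of 3 and 1 under Euclidean distance but 1 and 3 under Manhattan distance, because the points (0, 0) and (1, 0) are closer to the first centroid in the Euclidean sense and closer to the second in the Manhattan sense. The sketch covers only a single assignment step; a full run would also recompute the centroids and repeat until the assignments stop changing.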
