Geodesic K-means clustering

We introduce a class of geodesic distances and extend the K-means clustering algorithm to use them. Empirically, we demonstrate that our geodesic K-means algorithm exhibits several desirable properties that classical K-means lacks: it adapts to clusters of varying density, is highly resistant to outliers, and handles clusters that are not linearly separable. Furthermore, our comparative experiments show that geodesic K-means is nearly competitive with state-of-the-art algorithms such as spectral and hierarchical clustering.
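To make the idea concrete, the following is a minimal sketch of one way a geodesic distance can be plugged into a K-means-style loop. It is not the construction from the paper: the abstract does not specify the class of geodesic distances, so the sketch assumes the common choice of shortest-path length over a k-nearest-neighbour graph with Euclidean edge weights. Because a mean is not defined for graph distances, it also swaps the centroid update for a medoid update (each center is the cluster member with the smallest sum of squared geodesic distances to the other members). The function names `knn_graph`, `geodesic_distances`, and `geodesic_kmeans` are hypothetical.

```python
import heapq
import math


def knn_graph(points, k):
    """Symmetric k-nearest-neighbour graph with Euclidean edge weights (assumed construction)."""
    n = len(points)
    adj = [dict() for _ in range(n)]
    for i in range(n):
        nearest = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in nearest[:k]:
            adj[i][j] = d
            adj[j][i] = d
    return adj


def geodesic_distances(adj):
    """All-pairs shortest-path lengths, running Dijkstra from every node.

    Pairs in different connected components keep distance math.inf.
    """
    n = len(adj)
    D = [[math.inf] * n for _ in range(n)]
    for s in range(n):
        D[s][s] = 0.0
        heap = [(0.0, s)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > D[s][u]:
                continue  # stale heap entry
            for v, w in adj[u].items():
                nd = d + w
                if nd < D[s][v]:
                    D[s][v] = nd
                    heapq.heappush(heap, (nd, v))
    return D


def assign(D, centers):
    """Label each point with the index of its geodesically nearest center."""
    return [
        min(range(len(centers)), key=lambda c: D[i][centers[c]])
        for i in range(len(D))
    ]


def geodesic_kmeans(D, init_centers, max_iter=100):
    """Lloyd-style alternation on a precomputed distance matrix.

    Centers are restricted to data points (medoids), an assumption of this sketch.
    """
    centers = list(init_centers)
    for _ in range(max_iter):
        labels = assign(D, centers)
        new_centers = []
        for c in range(len(centers)):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:
                new_centers.append(centers[c])  # keep the center of an empty cluster
                continue
            new_centers.append(
                min(members, key=lambda m: sum(D[m][j] ** 2 for j in members))
            )
        if new_centers == centers:
            break
        centers = new_centers
    return assign(D, centers), centers
```

A typical call would be `labels, centers = geodesic_kmeans(geodesic_distances(knn_graph(points, k)), init_centers)`, where `init_centers` is a list of point indices. The all-pairs Dijkstra step dominates the cost, so this sketch is only practical for small data sets; the neighbourhood size `k` controls how closely the graph follows the shape of the data.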
