Mode-seeking by Medoidshifts

We present a nonparametric mode-seeking algorithm, called medoidshift, based on approximating the local gradient using a weighted estimate of medoids. Like meanshift, medoidshift clustering automatically computes the number of clusters, and the data do not have to be linearly separable. Unlike meanshift, the proposed algorithm does not require the definition of a mean. This property allows medoidshift to find modes even when only a distance measure between samples is defined. In this sense, medoidshift relates to meanshift as k-medoids relates to k-means. We show that medoidshift can also be used for incremental clustering of growing datasets by recycling previous computations. We present experimental results using medoidshift for image segmentation, incremental clustering for shot segmentation, and clustering of nonlinearly separable data.
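
To make the idea of a weighted medoid estimate concrete, the following is a minimal sketch, not the authors' implementation. It assumes one common formulation: each sample points to the sample that minimizes the kernel-weighted sum of squared distances to its neighborhood, and following these pointers to their fixed points yields the modes. The function name `medoidshift`, the Gaussian kernel, the use of squared distances, and the `bandwidth` parameter are all illustrative assumptions; only a pairwise distance matrix is needed, and no mean is ever computed.

```python
import numpy as np

def medoidshift(D, bandwidth):
    """Hypothetical sketch of medoid-based mode seeking.

    D: (n, n) symmetric matrix of pairwise distances.
    Returns an array where entry j is the index of the sample
    (the mode) that sample j is assigned to.
    """
    D = np.asarray(D, dtype=float)
    # Assumed kernel: Gaussian weights on the distances.
    W = np.exp(-((D / bandwidth) ** 2))
    # S[i, j] = sum_k D[i, k]^2 * W[k, j]: cost of choosing sample i
    # as the weighted medoid of the neighborhood around sample j.
    S = (D ** 2) @ W
    # Each sample j shifts to the candidate i with the lowest cost.
    parent = np.argmin(S, axis=0)
    # Follow the pointers until every label is a fixed point (a mode).
    # The loop is bounded so it terminates even if ties create a cycle.
    labels = parent.copy()
    for _ in range(len(D)):
        nxt = parent[labels]
        if np.array_equal(nxt, labels):
            break
        labels = nxt
    return labels

# Usage: two well-separated groups on a line, given only as distances.
x = np.array([0.0, 1.0, 2.0, 100.0, 101.0, 102.0])
D = np.abs(np.subtract.outer(x, x))
print(medoidshift(D, bandwidth=2.0))
```

The number of clusters is not supplied anywhere: it is the number of distinct fixed points in the returned labels, and it changes with the assumed `bandwidth`.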
