Agglomerative Mean-Shift Clustering

Mean-Shift (MS) is a powerful nonparametric clustering method. Although it can achieve good accuracy, its computational cost is high even on moderately sized data sets. In this paper, for the purpose of algorithmic speedup, we develop an agglomerative MS clustering method and analyze its performance. Our method, named Agglo-MS, is built on an iterative query-set compression mechanism motivated by the quadratic bounding optimization nature of the MS algorithm. The whole framework can be implemented with linear running time complexity. We then extend Agglo-MS to an incremental version that performs comparably to its batch counterpart. The efficiency and accuracy of Agglo-MS are demonstrated by extensive comparative experiments on synthetic and real data sets.
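To make the cost concern concrete, the following is a minimal sketch of the standard Gaussian mean-shift baseline, not of Agglo-MS itself, whose query-set compression step is not specified in this abstract. The function names, the Gaussian kernel, the `bandwidth` parameter, and the `merge_radius` used to group converged points are all illustrative assumptions. Each update of one query point visits every data point, so running all n points as queries costs on the order of n^2 kernel evaluations per sweep.

```python
import math

def mean_shift_step(x, data, bandwidth):
    """One Gaussian mean-shift update of query point x against all data points."""
    weights = [
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, p)) / (2.0 * bandwidth ** 2))
        for p in data
    ]
    total = sum(weights)
    # The new position is the kernel-weighted mean of the data.
    return tuple(
        sum(w * p[d] for w, p in zip(weights, data)) / total
        for d in range(len(x))
    )

def mean_shift(data, bandwidth, tol=1e-6, max_iter=500):
    """Run every data point as a query until it stops moving (assumed stopping rule)."""
    modes = []
    for x in data:
        for _ in range(max_iter):
            new_x = mean_shift_step(x, data, bandwidth)
            shift = math.dist(new_x, x)
            x = new_x
            if shift < tol:
                break
        modes.append(x)
    return modes

def label_modes(modes, merge_radius):
    """Group converged points that lie within merge_radius of an earlier center."""
    centers, labels = [], []
    for m in modes:
        for i, c in enumerate(centers):
            if math.dist(m, c) < merge_radius:
                labels.append(i)
                break
        else:
            centers.append(m)
            labels.append(len(centers) - 1)
    return labels

# Two small, well-separated groups of 2-D points (made-up illustrative data).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
modes = mean_shift(points, bandwidth=0.5)
labels = label_modes(modes, merge_radius=0.1)
```

In this sketch the points of each group converge to a common mode and receive the same label. The nested loop over queries and data points is the cost that a compressed query set is meant to reduce.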
