Fast affinity propagation clustering: A multilevel approach

In this paper, we propose a novel Fast Affinity Propagation clustering approach (FAP). FAP considers both the local and the global structure information contained in a dataset, and is a high-quality multilevel graph partitioning method that supports both vector-based and graph-based clustering. First, a new Fast Sampling algorithm (FS) coarsens the input sparse graph and selects a small number of final representative exemplars. Then, a density-weighted spectral clustering method partitions those exemplars according to the global underlying structure of the data manifold. Finally, every data point receives the cluster assignment of its corresponding representative exemplar. Experimental results on two synthetic datasets and many real-world datasets show that our algorithm outperforms the original affinity propagation algorithm and state-of-the-art spectral clustering algorithms in terms of speed, memory usage, and quality on both vector-based and graph-based clustering.
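The three-stage pipeline described in the abstract (coarsen to exemplars, partition the exemplars spectrally, propagate labels back) can be illustrated with a minimal sketch. It is not the FS or FAP algorithm itself: farthest-point sampling stands in for the Fast Sampling step, the density weight (the square root of the product of the point counts owned by two exemplars) is an assumed weighting chosen for illustration, and the function names `pick_exemplars`, `kmeans`, and `multilevel_cluster` as well as the parameters `m`, `k`, and `sigma` are hypothetical.

```python
import numpy as np


def pick_exemplars(X, m, seed=0):
    """Stand-in for the coarsening step: farthest-point sampling.

    Assumption: this is NOT the paper's Fast Sampling (FS) algorithm,
    only a simple way to get m spread-out representatives.
    """
    rng = np.random.default_rng(seed)
    idx = [int(rng.integers(len(X)))]
    d = np.linalg.norm(X - X[idx[0]], axis=1)
    for _ in range(m - 1):
        nxt = int(np.argmax(d))  # point farthest from all chosen exemplars
        idx.append(nxt)
        d = np.minimum(d, np.linalg.norm(X - X[nxt], axis=1))
    return np.array(idx)


def kmeans(U, k, iters=50):
    """Plain Lloyd's k-means with a deterministic farthest-point init."""
    centers = [U[0]]
    for _ in range(k - 1):
        d = np.min([((U - c) ** 2).sum(1) for c in centers], axis=0)
        centers.append(U[int(np.argmax(d))])
    C = np.array(centers)
    for _ in range(iters):
        lab = ((U[:, None, :] - C[None, :, :]) ** 2).sum(-1).argmin(1)
        for j in range(k):
            if np.any(lab == j):  # keep the old center if a cluster empties
                C[j] = U[lab == j].mean(0)
    return lab


def multilevel_cluster(X, m, k, sigma=1.0, seed=0):
    """Cluster n points via m exemplars; returns one label per point."""
    # Stage 1: coarsen the data to m exemplars and record ownership.
    ex = pick_exemplars(X, m, seed)
    E = X[ex]
    owner = ((X[:, None, :] - E[None, :, :]) ** 2).sum(-1).argmin(1)
    density = np.bincount(owner, minlength=m).astype(float)

    # Stage 2: spectral clustering on the exemplars only.
    W = np.exp(-((E[:, None, :] - E[None, :, :]) ** 2).sum(-1) / (2 * sigma**2))
    np.fill_diagonal(W, 0.0)
    W *= np.sqrt(np.outer(density, density))  # assumed density weighting
    deg = W.sum(1)
    A = W / np.sqrt(np.outer(deg, deg))       # symmetric normalized affinity
    _, vecs = np.linalg.eigh(A)               # eigenvalues in ascending order
    U = vecs[:, -k:]                          # top-k eigenvectors
    U = U / np.linalg.norm(U, axis=1, keepdims=True)
    ex_labels = kmeans(U, k)

    # Stage 3: each point inherits the label of its exemplar.
    return ex_labels[owner]
```

Calling `multilevel_cluster(X, m=10, k=2)` on an `(n, d)` array returns `n` labels. The eigendecomposition runs on an `m x m` matrix rather than an `n x n` one, which is where a multilevel scheme of this kind saves time and memory when `m` is much smaller than `n`.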
