An improved spectral clustering algorithm based on random walk

The construction of the similarity matrix has a strong impact on the performance of spectral clustering algorithms. In this paper, we propose a random-walk-based approach to processing the Gaussian kernel similarity matrix. In this method, the pairwise similarity between two data points depends not only on the two points themselves but also on their neighbors. As a result, the new similarity matrix is closer to the ideal matrix, which yields the best clustering result. We give a theoretical analysis of the new similarity matrix and apply it to spectral clustering. We also propose a method for handling noisy items, which can degrade clustering performance. Experimental results on real-world data sets show that the proposed spectral clustering algorithm significantly outperforms existing algorithms.
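To make the idea concrete, the following is a minimal sketch of one way a Gaussian kernel similarity matrix could be refined with a random walk. It is not the construction defined in the paper, which the abstract does not spell out. It assumes the kernel is row-normalized into a transition matrix and that a short multi-step walk, symmetrized afterwards, serves as the new similarity. The function names and the parameters `sigma` and `steps` are hypothetical.

```python
import numpy as np

def gaussian_similarity(X, sigma=1.0):
    """Gaussian kernel similarity between all pairs of rows of X (zero diagonal)."""
    sq = np.sum(X ** 2, axis=1)
    # Squared Euclidean distances, clipped at zero to absorb rounding error.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W

def random_walk_similarity(W, steps=2):
    """Assumed refinement: multi-step transition probabilities, symmetrized.

    The similarity of i and j then depends on their shared neighbors,
    because a walk from i reaches j through intermediate points.
    """
    P = W / W.sum(axis=1, keepdims=True)   # row-stochastic transition matrix
    M = np.linalg.matrix_power(P, steps)   # probabilities after `steps` steps
    return 0.5 * (M + M.T)                 # symmetrize for spectral clustering

# Two small, well-separated groups of points (illustrative data only).
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
W = gaussian_similarity(X, sigma=1.0)
S = random_walk_similarity(W, steps=2)
```

In this toy example, points in the same group end up with a much larger refined similarity than points in different groups, because walks rarely cross between the groups. The matrix `S` could then be passed to any standard spectral clustering routine in place of `W`.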
