FUSE: Full Spectral Clustering

Multi-scale data, which contains structures at different scales of size and density, poses a serious challenge for spectral clustering: even given a suitable locally scaled affinity matrix, the first k eigenvectors of that matrix often cannot separate the clusters well. In this paper, we therefore exploit the fusion of the cluster-separation information from all eigenvectors to achieve a better clustering result. Our method, FUll Spectral ClustEring (FUSE), is based on Power Iteration (PI) and Independent Component Analysis (ICA). PI fuses all eigenvectors into one pseudo-eigenvector that inherits their cluster-separation information. To overcome the cluster-collision problem, we use PI to generate p (p > k) pseudo-eigenvectors. Because these pseudo-eigenvectors are redundant and their cluster-separation information is contaminated with noise, ICA is adopted to rotate them so that they become pairwise statistically independent. To help ICA escape local optima and to speed up the search, we develop a self-adaptive and self-learning greedy search method. Finally, we select the k rotated pseudo-eigenvectors (independent components) that carry the most cluster-separation information, as measured by kurtosis, and use them for clustering. Experiments on various synthetic and real-world datasets verify the effectiveness and efficiency of FUSE.
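The pipeline described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a precomputed affinity matrix, uses a fixed number of truncated power-iteration steps in place of the paper's stopping rule, substitutes scikit-learn's FastICA for the paper's self-adaptive greedy ICA search, and the parameter choices (p, iteration count, normalization) are illustrative.

```python
# Hypothetical sketch of the FUSE steps: PI pseudo-eigenvectors -> ICA rotation
# -> kurtosis-based selection -> k-means. Details are assumptions, not the
# authors' exact algorithm.
import numpy as np
from scipy.stats import kurtosis
from sklearn.decomposition import FastICA
from sklearn.cluster import KMeans

def fuse_sketch(A, k, p, n_iter=10, seed=0):
    """Cluster n points given an n-by-n affinity matrix A into k clusters,
    using p > k power-iteration pseudo-eigenvectors."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    W = A / A.sum(axis=1, keepdims=True)      # row-normalized (random-walk) matrix
    # Step 1: truncated power iteration from p random starts; each run yields
    # one pseudo-eigenvector that mixes the separation information of all
    # eigenvectors (stopping early, before convergence to the trivial vector).
    V = np.empty((n, p))
    for j in range(p):
        v = rng.random(n)
        for _ in range(n_iter):
            v = W @ v
            v /= np.abs(v).sum()              # L1 normalization keeps values bounded
        V[:, j] = v
    # Step 2: rotate the redundant pseudo-eigenvectors with ICA so the
    # components become pairwise statistically independent.
    S = FastICA(n_components=p, random_state=seed).fit_transform(V)
    # Step 3: keep the k components with the most cluster-separation
    # information, measured here by absolute excess kurtosis.
    idx = np.argsort(-np.abs(kurtosis(S, axis=0)))[:k]
    # Step 4: cluster the points in the selected k-dimensional embedding.
    return KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(S[:, idx])
```

For example, on two well-separated Gaussian blobs with an RBF affinity, `fuse_sketch(A, k=2, p=4)` returns one label per point.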
