Guided Co-training for Large-Scale Multi-View Spectral Clustering

In many real-world applications, we have access to multiple views of the data, each of which characterizes the data from a distinct aspect. Several previous algorithms have demonstrated that appropriately integrating information from all views yields better clustering accuracy than using any single view alone. Owing to the effectiveness of spectral clustering, many multi-view clustering methods are built on it. Unfortunately, the high computational complexity of spectral clustering limits their applicability to large-scale data. In this work, we propose a novel multi-view spectral clustering method for large-scale data. Our approach follows the guided co-training scheme to fuse distinct views, and uses a sampling technique to accelerate spectral clustering. More specifically, we first select $p$ ($\ll n$) landmark points from the $n$ data points and then approximate the eigen-decomposition accordingly. The augmented view, which is essential to the guided co-training process, can then be determined quickly by our method. The proposed algorithm scales linearly with the number of data points. Extensive experiments have been performed, and the results support the advantage of our method in the large-scale multi-view setting.
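To make the landmark step concrete, the following is a minimal single-view sketch of a landmark-based approximation to the spectral embedding. It is an illustration under stated assumptions, not the proposed algorithm: it does not construct the augmented view or perform guided co-training. The function name `landmark_spectral_embedding`, the uniform random choice of landmarks, the number of nearest landmarks `r`, and the bandwidth heuristic are all hypothetical choices made for this example. The point it shows is that the only eigen-decomposition is of a $p \times p$ matrix, so the cost grows linearly in $n$ for fixed $p$.

```python
import numpy as np

def landmark_spectral_embedding(X, p, k, r=5, seed=0):
    """Approximate a k-dimensional spectral embedding of X using p landmarks.

    Assumptions (not from the paper): landmarks are sampled uniformly at
    random, each point is linked to its r nearest landmarks with Gaussian
    weights, and the bandwidth is the mean of those squared distances.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    landmarks = X[rng.choice(n, size=p, replace=False)]

    # Squared Euclidean distances between all points and landmarks (n x p).
    d2 = ((X ** 2).sum(axis=1)[:, None]
          - 2.0 * X @ landmarks.T
          + (landmarks ** 2).sum(axis=1)[None, :])
    d2 = np.maximum(d2, 0.0)

    # Keep only the r nearest landmarks of each point.
    idx = np.argpartition(d2, r - 1, axis=1)[:, :r]
    rows = np.arange(n)[:, None]
    near = d2[rows, idx]

    # Gaussian weights, normalized so that every row of Z sums to one.
    bandwidth = near.mean() + 1e-12
    w = np.exp(-near / (2.0 * bandwidth))
    w /= w.sum(axis=1, keepdims=True)
    Z = np.zeros((n, p))
    Z[rows, idx] = w

    # Scale columns so that Zh @ Zh.T acts as a normalized affinity matrix.
    col = Z.sum(axis=0)
    col[col == 0] = 1.0  # guard against landmarks that attract no points
    Zh = Z / np.sqrt(col)

    # Eigen-decompose the small p x p matrix instead of the n x n affinity.
    evals, evecs = np.linalg.eigh(Zh.T @ Zh)
    order = np.argsort(evals)[::-1][:k]
    s = np.sqrt(np.maximum(evals[order], 1e-12))

    # Recover the left singular vectors of Zh, one embedding row per point.
    U = Zh @ evecs[:, order] / s
    return U, s
```

The rows of the returned `U` can then be row-normalized and clustered with k-means, as in standard spectral clustering. Forming `Zh.T @ Zh` costs $O(np^2)$ and its eigen-decomposition costs $O(p^3)$; the dense distance matrix used here for brevity costs $O(npd)$ for $d$-dimensional data and could be replaced by a sparse nearest-landmark search.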
