Discrete Nonnegative Spectral Clustering

Spectral clustering plays a vital role in many research areas. Most traditional spectral clustering algorithms comprise two independent stages (e.g., first learning continuous labels and then rounding the learned labels into discrete ones), which may cause the resulting cluster labels to deviate unpredictably from the genuine ones, leading to severe information loss and performance degradation. In this work, we study how to achieve discrete clustering while also generalizing reliably to unseen data. We propose a novel spectral clustering scheme that exploits the properties of cluster labels, including discreteness, nonnegativity, and discrimination, and that learns robust out-of-sample prediction functions. Specifically, we explicitly enforce a discrete transformation on the intermediate continuous labels, which leads to a tractable optimization problem with a discrete solution. In addition, we preserve the natural nonnegativity of the cluster labels to enhance the interpretability of the results. Moreover, to compensate for the unreliability of the learned cluster labels, we integrate an adaptive robust module with an $\ell_{2,p}$ loss to learn a prediction function for grouping unseen data. We also show that, under certain conditions, the out-of-sample component can inject discriminative knowledge into the learning of cluster labels. Extensive experiments on various data sets demonstrate the superiority of our proposal over several existing clustering approaches.
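As a rough illustration of two ingredients named above, the following sketch shows a naive rounding step of the kind used in two-stage pipelines and an $\ell_{2,p}$ loss on a residual matrix. Several things here are assumptions, since the abstract gives no formulas: the loss is taken to be the commonly used row-wise form $\sum_i \|r_i\|_2^p$, the rounding is a simple row-wise argmax standing in for whatever rounding a given two-stage method uses (it is not the discrete transformation proposed in this work), and the function names `l2p_loss` and `round_to_one_hot` are hypothetical.

```python
import math


def l2p_loss(residual_rows, p=1.0):
    """Assumed l_{2,p} loss: sum over rows of (Euclidean norm of the row) ** p.

    residual_rows is a list of rows, e.g. prediction minus label per sample.
    Smaller p down-weights rows with large residuals, which is the usual
    motivation for using this loss when labels are unreliable.
    """
    total = 0.0
    for row in residual_rows:
        row_norm = math.sqrt(sum(v * v for v in row))
        total += row_norm ** p
    return total


def round_to_one_hot(continuous_labels):
    """Naive rounding (illustrative stand-in): set the largest entry per row to 1."""
    one_hot = []
    for row in continuous_labels:
        best = max(range(len(row)), key=lambda j: row[j])
        one_hot.append([1 if j == best else 0 for j in range(len(row))])
    return one_hot


if __name__ == "__main__":
    relaxed = [[0.2, 0.8], [0.9, 0.1]]   # continuous labels from a relaxed problem
    print(round_to_one_hot(relaxed))     # discrete, nonnegative indicator rows
    print(l2p_loss([[3.0, 4.0], [0.0, 0.0]], p=1.0))
```

Because the rounding here is decoupled from the objective that produced the continuous labels, nothing guarantees that the rounded indicators remain close to optimal for that objective, which is the deviation the abstract attributes to two-stage methods.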
