Scalable Auto-weighted Discrete Multi-view Clustering

Multi-view clustering, which exploits complementary information across views to improve clustering performance, has been widely studied in machine learning. However, traditional approaches have high time complexity, which makes large-scale multi-view data difficult to handle. Existing approaches also depend on parameter selection, which is difficult in practical clustering applications because labeled data is unavailable, especially at big-data scale. In this paper, we propose a novel approach for large-scale multi-view clustering that addresses these challenges. Our approach learns a low-dimensional binary embedding of multi-view data, preserves the samples’ local structure during the embedding, and optimizes the embedding and the clustering in a unified framework. Furthermore, we propose learning the parameters with a combination of data-driven and heuristic approaches. Experiments on five large-scale multi-view datasets show that the proposed method outperforms state-of-the-art methods in both clustering quality and running time.
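To make the idea of clustering in a binary embedding space concrete, the following is a minimal sketch and not the method proposed in the paper. It assumes a fixed random sign projection in place of a learned embedding (sign random projections approximately preserve angular similarity, a crude stand-in for local-structure preservation), equal view weights in place of learned ones, and a separate k-means step with binary centroids in place of joint optimization. The function names `binary_embed` and `binary_kmeans`, the toy data, and all parameter values are hypothetical.

```python
import numpy as np

def binary_embed(views, n_bits, rng):
    """Map multi-view data to {-1, +1} codes (illustrative, not the paper's model).

    Each view is centered, projected with a random Gaussian matrix,
    and the projections are summed with equal weights before taking the sign.
    """
    n = views[0].shape[0]
    total = np.zeros((n, n_bits))
    for X in views:
        Xc = X - X.mean(axis=0)                        # center the view
        W = rng.standard_normal((X.shape[1], n_bits))  # random projection (assumption)
        total += Xc @ W / np.sqrt(X.shape[1])
    return np.where(total >= 0, 1, -1).astype(np.int8)

def binary_kmeans(B, k, n_iter, rng):
    """k-means on binary codes with binary centroids and Hamming distance."""
    n, r = B.shape
    B32 = B.astype(np.int32)
    centers = B32[rng.choice(n, size=k, replace=False)].copy()

    def assign(centers):
        # For codes in {-1, +1}: hamming(b, c) = (r - <b, c>) / 2
        dist = (r - B32 @ centers.T) // 2
        return dist.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centers)
        for j in range(k):
            members = B32[labels == j]
            if len(members):
                # Binary centroid: bitwise majority vote over cluster members
                centers[j] = np.where(members.sum(axis=0) >= 0, 1, -1)
    return assign(centers), centers

# Toy data: 200 samples, two views, two well-separated groups (hypothetical).
rng = np.random.default_rng(0)
shift = np.repeat([0.0, 4.0], 100)[:, None]
view1 = rng.standard_normal((200, 20)) + shift
view2 = rng.standard_normal((200, 30)) - shift

B = binary_embed([view1, view2], n_bits=16, rng=rng)
labels, centers = binary_kmeans(B, k=2, n_iter=10, rng=rng)
print(B.shape, np.bincount(labels, minlength=2))
```

Because codes are stored as small integers and distances reduce to integer dot products, both memory use and assignment cost stay low as the number of samples grows, which is the general motivation for binary embeddings in large-scale clustering.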
