Robust multi-view data clustering with multi-view capped-norm K-means

Abstract: Real-world data sets often comprise multiple representations, or views, which provide different and complementary information. Multi-view clustering is an important approach for analyzing multi-view data in an unsupervised way. Previous studies have shown that integrating information from all views yields better clustering accuracy than relying on each view individually. That is, the hidden patterns in the data can be better explored by discovering the common latent structure shared by multiple views. However, traditional multi-view clustering methods are usually sensitive to noise and outliers, which greatly impair clustering performance in practical problems. Furthermore, existing multi-view clustering methods, such as graph-based methods, have high computational complexity because of the kernel/affinity matrix construction or the eigendecomposition they require. To address these problems, we propose a novel robust multi-view clustering method that integrates heterogeneous representations of data. To make the method robust to noise and outliers, especially extreme outliers, we use the capped-norm loss as the objective. The proposed method has low computational complexity, on the same level as the classic K-means algorithm, which is a major advantage for unsupervised learning. We derive a new, efficient optimization algorithm to solve the multi-view clustering problem. Finally, extensive experiments on benchmark data sets show that the proposed method consistently outperforms state-of-the-art clustering methods.
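To illustrate why a capped-norm loss limits the influence of extreme outliers, the following minimal sketch compares a plain distance-based K-means loss with a capped one on a single view. It is not the paper's objective or algorithm: the function names `capped_l2` and `capped_kmeans_loss`, the threshold `eps`, and the toy data are all assumptions made for illustration.

```python
import math

def capped_l2(x, center, eps):
    """Capped L2 distance (illustrative): min(||x - center||_2, eps)."""
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, center)))
    return min(dist, eps)

def capped_kmeans_loss(points, centers, labels, eps):
    """Sum of capped distances from each point to its assigned center.

    `eps` is a hypothetical cap: no single point can contribute more
    than `eps` to the loss, however far it lies from its center.
    """
    return sum(capped_l2(p, centers[k], eps) for p, k in zip(points, labels))

# Toy single-view data: two ordinary points and one extreme outlier.
points = [(0.0, 0.0), (3.0, 4.0), (300.0, 400.0)]
centers = [(0.0, 0.0)]
labels = [0, 0, 0]

# With an infinite cap the loss reduces to the plain sum of L2 distances.
plain = capped_kmeans_loss(points, centers, labels, eps=math.inf)
capped = capped_kmeans_loss(points, centers, labels, eps=10.0)

print(plain)   # the outlier dominates: 0 + 5 + 500
print(capped)  # the outlier's contribution is capped at 10: 0 + 5 + 10
```

In this toy setting the outlier accounts for almost all of the uncapped loss, whereas under the cap it contributes no more than `eps`. A multi-view variant would presumably combine such per-view terms, but how the views are weighted is not specified in the abstract.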
