Adaptive Graph Regularized Low-Rank Matrix Factorization With Noise and Outliers for Clustering

Clustering is a widely used tool in machine learning and data mining and has been studied extensively. However, real data usually contain noise and outliers, which can introduce significant errors into the clustering results. In this paper, we propose a robust clustering model with adaptive graph regularization (RCAG). A sparse error matrix is introduced to model sparse noise, such as impulse noise, dead lines, and stripes, and the $\ell_{1}$ norm is imposed on it to suppress that noise. In addition, the $\ell_{2,1}$ norm, which is rotation invariant, is used to mitigate the effects of outliers. As a result, RCAG is insensitive to both noise and outliers. More importantly, adaptive graph regularization is incorporated into RCAG to improve clustering performance. To solve the resulting optimization problem, we derive an iterative updating algorithm based on the Augmented Lagrangian Method (ALM) that updates each optimization variable in turn. The convergence and time complexity of RCAG are also established theoretically. Finally, experiments on fourteen datasets from four application scenarios, including face images, handwriting recognition, and UCI data, demonstrate the superiority of the proposed method over seven existing classical clustering methods. RCAG achieves better clustering performance in terms of ACC and Purity, although it is somewhat less impressive in other respects.
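
The two norms named in the abstract can be illustrated with a minimal NumPy sketch. It is not the RCAG algorithm itself: the function names are hypothetical, and the convention that each column of the data matrix is one sample is an assumption, since the abstract does not state the layout. The sketch checks numerically that the $\ell_{2,1}$ norm is unchanged when the feature space is rotated by an orthogonal matrix, and shows the standard soft-thresholding operator that typically appears when an $\ell_{1}$-penalized error matrix is updated inside an ALM-style loop.

```python
import numpy as np

def l1_norm(E):
    """Entrywise l1 norm: sum of absolute values of all entries."""
    return np.abs(E).sum()

def l21_norm(X):
    """l2,1 norm under the assumed layout (columns are samples):
    sum of the Euclidean norms of the columns."""
    return np.linalg.norm(X, axis=0).sum()

def soft_threshold(E, tau):
    """Proximal operator of tau * ||E||_1 (entrywise shrinkage)."""
    return np.sign(E) * np.maximum(np.abs(E) - tau, 0.0)

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 8))                    # 5 features, 8 samples (toy data)
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))   # random orthogonal matrix

# Rotating the feature space leaves every column norm unchanged,
# so the l2,1 norm is rotation invariant; the l1 norm generally is not.
l21_before, l21_after = l21_norm(X), l21_norm(Q @ X)
l1_before, l1_after = l1_norm(X), l1_norm(Q @ X)

# Shrinkage zeroes small entries and reduces large ones by tau.
shrunk = soft_threshold(np.array([3.0, -0.5, 1.0]), 1.0)
```

Because each sample contributes its unsquared Euclidean norm, a single outlying column influences the $\ell_{2,1}$ norm linearly rather than quadratically, which is the usual argument for its robustness to outliers.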
