Joint image clustering and feature selection with auto-adjoined learning for high-dimensional data

Abstract: With the rapid development of modern multimedia techniques, high-dimensional image data are frequently encountered in image analysis tasks such as clustering and feature learning. K-means (KM) is one of the most widely used and efficient tools for clustering such data. However, because high-dimensional data commonly contain irrelevant features and noise, conventional KM suffers from degraded performance on them. Recent studies try to overcome this problem by combining KM with subspace learning. Nevertheless, these methods usually depend on complex eigenvalue decomposition, which is computationally expensive. Moreover, their clustering models ignore the local manifold structure of the data and thus fail to exploit the underlying adjacency information. Two points are critical for clustering high-dimensional image data: efficient feature selection and clear adjacency exploration. Based on these considerations, we propose an auto-adjoined subspace clustering method. Concretely, to efficiently locate redundant features, we introduce an extremely sparse feature selection matrix into KM, which is easy to optimize. In addition, to accurately encode the local adjacency among data without the influence of noise, we propose to automatically assign the connectivity of each sample in the low-dimensional feature space. Compared with several state-of-the-art clustering methods, the proposed method consistently improves clustering performance on six publicly available benchmark image datasets, demonstrating its effectiveness.
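The two ingredients named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' algorithm: the abstract gives no objective function or update rules, so everything here is an assumption. The helper names `selection_matrix` and `adaptive_neighbors` are hypothetical, the 0/1 column-selection form of the sparse matrix is a guess at what "extremely sparse" means, and the neighbor weights use a well-known closed form from the adaptive-neighbors literature that may differ from the rule used in the paper.

```python
import numpy as np


def selection_matrix(n_features, selected):
    """Build a d x m 0/1 matrix W with a single 1 per column, so that
    X @ W keeps only the columns of X listed in `selected`.
    (Assumed form of the sparse selection matrix; not from the paper.)"""
    W = np.zeros((n_features, len(selected)))
    W[selected, np.arange(len(selected))] = 1.0
    return W


def adaptive_neighbors(Z, k):
    """Give each row of Z a sparse weight vector over its k nearest
    neighbours. Every row of S is nonnegative and sums to 1.
    Requires at least k + 2 samples. (Assumed rule; not from the paper.)"""
    n = Z.shape[0]
    # Squared Euclidean distances between all pairs of samples.
    sq = np.sum(Z ** 2, axis=1)
    D = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T, 0.0)
    np.fill_diagonal(D, np.inf)  # a sample is never its own neighbour
    S = np.zeros((n, n))
    for i in range(n):
        order = np.argsort(D[i])[: k + 1]  # k neighbours plus the (k+1)-th
        d = D[i, order]
        denom = k * d[k] - d[:k].sum()
        if denom > 1e-12:
            # Closer neighbours get larger weights; the (k+1)-th gets zero.
            S[i, order[:k]] = (d[k] - d[:k]) / denom
        else:
            S[i, order[:k]] = 1.0 / k  # all k+1 distances tie: uniform weights
    return S


# Toy data: 20 samples with 6 features, of which features 0 and 3 are kept.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 6))
W = selection_matrix(6, [0, 3])
Z = X @ W                        # samples in the selected feature subspace
S = adaptive_neighbors(Z, k=3)   # adjacency learned in that subspace
```

Computing the adjacency on `Z` rather than on `X` is the point of the sketch: distances are measured only over the retained features, so discarded noisy features cannot distort the neighborhood structure.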
