Unsupervised feature selection based on joint spectral learning and general sparse regression

Unsupervised feature selection is an important machine learning task, since manually annotated data are expensive to obtain and therefore scarce. However, because different data samples contain noise and outliers, selecting features without the discriminative information embedded in annotations is quite challenging. To relieve these limitations, we investigate embedding spectral learning into a general sparse regression framework for unsupervised feature selection. The proposed general spectral sparse regression (GSSR) method jointly handles outlier features by learning joint sparsity and noisy features by preserving the local structure of the data. Specifically, GSSR proceeds in two stages. First, classic sparse dictionary learning is used to build bases for the original data. The original data are then projected onto the basis space by learning a new representation via GSSR. In GSSR, the robust $\ell_{2,r}$-norm loss ($0 < r \le 2$) and the $\ell_{2,p}$-norm ($0 < p \le 1$) are adopted simultaneously as the reconstruction term and the sparse regularization term, in place of the traditional Frobenius-norm least-squares loss. Furthermore, the local topological structure of the new representation is preserved through spectral learning based on a Laplacian term. The overall objective function of GSSR is optimized and proved to converge. Experimental results on several public datasets demonstrate the validity of our algorithm, which outperforms state-of-the-art feature selection methods in terms of classification performance.
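To make the objective concrete, the sketch below evaluates a GSSR-style cost of the form $\|XW - Z\|_{2,r}^{r} + \alpha \|W\|_{2,p}^{p} + \beta\,\mathrm{tr}(Z^{\top} L Z)$: a robust $\ell_{2,r}$ reconstruction term, a row-sparse $\ell_{2,p}$ regularizer, and a Laplacian smoothness term. This is a minimal illustration assuming this standard form of the objective; the variable names (`W`, `Z`, `L`, `alpha`, `beta`) and the helper `l2p_norm_power` are ours, not from the paper, and the paper's actual optimization algorithm is not reproduced here.

```python
import numpy as np

def l2p_norm_power(M, p):
    """Sum of row-wise l2 norms raised to the power p: sum_i ||m_i||_2^p.
    For p = 1 this is the familiar l2,1-norm used in joint sparse regression."""
    row_norms = np.linalg.norm(M, axis=1)
    return np.sum(row_norms ** p)

def gssr_objective(X, W, Z, L, alpha, beta, r=1.0, p=0.5):
    """Hypothetical GSSR-style objective (an illustrative assumption):
      ||X W - Z||_{2,r}^r          robust reconstruction term, 0 < r <= 2
    + alpha * ||W||_{2,p}^p        row-sparse regularizer,      0 < p <= 1
    + beta  * tr(Z^T L Z)          Laplacian term preserving local structure
    where L is a graph Laplacian built from the data's neighborhood graph.
    """
    recon = l2p_norm_power(X @ W - Z, r)        # small r downweights outlier samples
    sparsity = l2p_norm_power(W, p)             # small p drives whole rows of W to zero
    laplacian = np.trace(Z.T @ L @ Z)           # penalizes representations that
                                                # separate neighboring samples
    return recon + alpha * sparsity + beta * laplacian
```

After optimization, features would typically be ranked by the row norms of `W`, with zeroed rows corresponding to discarded features.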
