Self-Weighted Supervised Discriminative Feature Selection

In this brief, a novel self-weighted orthogonal linear discriminant analysis (SOLDA) problem is proposed, and a self-weighted supervised discriminative feature selection (SSD-FS) method is derived by introducing sparsity-inducing regularization into the proposed SOLDA problem. By learning a row-sparse projection, the proposed SSD-FS method avoids a drawback of many sparse feature selection approaches, which may suppress too many rows of the projection to zero, leaving too few nonzero rows for the associated features to be selected. More specifically, the orthogonal constraint guarantees a minimal number of selectable features for the proposed SSD-FS method. In addition, the proposed method exploits the discriminant power of the labels so that discriminative features are selected. The effectiveness of the proposed SSD-FS method is validated both theoretically and experimentally.
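
The abstract does not spell out the optimization algorithm, but the selection principle it describes, learning an orthogonal, discriminative projection and keeping the features whose projection rows have the largest norms, can be illustrated with a minimal, hypothetical Python sketch. The sketch below is not the authors' SSD-FS method: it substitutes a plain maximum-margin-criterion style orthogonal projection for SOLDA, omits the self-weighting and sparsity-inducing regularization, and all function names and parameters are illustrative assumptions.

```python
# Hypothetical sketch (not the authors' SSD-FS algorithm): rank features by
# the l2-norms of the rows of an orthogonal discriminative projection.
import numpy as np

def orthogonal_discriminant_projection(X, y, n_components):
    """Orthogonal projection maximizing trace(W^T (Sb - Sw) W), a
    maximum-margin-criterion style surrogate, via symmetric eigendecomposition
    (eigenvectors of a symmetric matrix are orthonormal, so W^T W = I)."""
    classes = np.unique(y)
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    Sb = np.zeros((d, d))   # between-class scatter
    Sw = np.zeros((d, d))   # within-class scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        diff = (mc - mean_all)[:, None]
        Sb += Xc.shape[0] * diff @ diff.T
        Sw += (Xc - mc).T @ (Xc - mc)
    eigvals, eigvecs = np.linalg.eigh(Sb - Sw)
    order = np.argsort(eigvals)[::-1]           # largest eigenvalues first
    return eigvecs[:, order[:n_components]]     # d x n_components, orthonormal

def select_features(X, y, n_features, n_components=None):
    """Score each feature by the l2-norm of its row in W and keep the top ones."""
    if n_components is None:
        n_components = len(np.unique(y)) - 1
    W = orthogonal_discriminant_projection(X, y, n_components)
    scores = np.linalg.norm(W, axis=1)          # row norms as feature scores
    return np.argsort(scores)[::-1][:n_features]

# Toy usage: feature 3 is made class-dependent, so it should rank near the top.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 10))
y = np.repeat([0, 1, 2], 20)
X[:, 3] += 2.0 * y
print(select_features(X, y, n_features=3))
```

In the actual method, the row sparsity induced by the regularizer would drive irrelevant rows toward zero, while the orthogonal constraint prevents the projection from collapsing so that enough nonzero rows remain to select the requested number of features; the sketch only mimics the final ranking step.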
