Self-weighted discriminative feature selection via adaptive redundancy minimization

Abstract In this paper, we first propose a novel self-weighted orthogonal linear discriminant analysis (SOLDA) method, in which an optimal weight is obtained automatically to balance the between-class and within-class scatter matrices. Since correlated features tend to receive similar rankings, criteria that score features individually may place highly correlated features at the top of the ranking, introducing redundant information into the selected subset. To minimize this redundancy, a regularization term that penalizes highly correlated features is added to the proposed SOLDA problem. Unlike existing methods, we optimize the redundancy matrix as a variable rather than fixing it a priori, so that correlations among all features can be evaluated adaptively. In addition, a new recursive method is derived to obtain the selection matrix heuristically, yielding a closed-form solution while preserving orthogonality. The resulting self-weighted discriminative feature selection via adaptive redundancy minimization (SDFS-ARM) method thus selects non-redundant discriminative features. Empirical results further validate the effectiveness of the proposed SDFS-ARM method.
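The two core ingredients named in the abstract can be illustrated with a minimal NumPy sketch: computing the between-class and within-class scatter matrices that SOLDA balances, and a feature–feature correlation matrix as a redundancy measure. Note this is not the authors' implementation; the fixed correlation matrix below is only a stand-in for the redundancy matrix that SDFS-ARM learns adaptively as an optimization variable.

```python
import numpy as np

def scatter_matrices(X, y):
    """Between-class (Sb) and within-class (Sw) scatter matrices.

    X: (n_samples, n_features) data matrix; y: class labels.
    """
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    Sb = np.zeros((d, d))
    Sw = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        diff = (mc - mean_all).reshape(-1, 1)
        Sb += Xc.shape[0] * (diff @ diff.T)   # class-mean spread
        Sw += (Xc - mc).T @ (Xc - mc)         # within-class spread
    return Sb, Sw

def redundancy_matrix(X):
    """Pairwise absolute feature correlations as a redundancy proxy.

    In SDFS-ARM this matrix is optimized jointly with the selection
    matrix; here it is computed once from the data for illustration.
    """
    A = np.abs(np.corrcoef(X, rowvar=False))
    np.fill_diagonal(A, 0.0)  # a feature is not redundant with itself
    return A
```

Penalizing features with large entries in the redundancy matrix discourages selecting several top-ranked but mutually correlated features, which is the failure mode the abstract describes.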
