Robust feature selection via simultaneous capped ℓ2-norm and ℓ2,1-norm minimization

High dimensionality is one of the key characteristics of big data. Feature selection plays a significant role in many machine learning applications dealing with high-dimensional data. To improve the robustness of feature selection, we propose a new robust feature selection method based on Simultaneous Capped ℓ2-norm loss and ℓ2,1-norm regularizer Minimization (SCM). The capped ℓ2-norm loss effectively eliminates the influence of noise and outliers in regression, while the ℓ2,1-norm regularizer selects features across all data points with joint sparsity. We also propose an efficient algorithm to solve the resulting minimization problem. Experimental studies on real-world data sets demonstrate the effectiveness of our method in comparison with other popular feature selection methods.
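
The abstract does not give the objective explicitly; a plausible form of the SCM problem, assuming a projection matrix W, a capping threshold ε, and a trade-off parameter γ (none of which are specified above), is

\[
\min_{W}\; \sum_{i=1}^{n} \min\!\left(\lVert W^{\top} x_i - y_i \rVert_2,\; \varepsilon\right) \;+\; \gamma\, \lVert W \rVert_{2,1},
\qquad
\lVert W \rVert_{2,1} = \sum_{j=1}^{d} \lVert w^{j} \rVert_2 ,
\]

where w^j denotes the j-th row of W. Under this formulation, any residual larger than ε contributes only the constant ε to the loss, so noisy samples and outliers cannot dominate the fit, while the row-wise ℓ2,1 penalty drives entire rows of W to zero, which corresponds to discarding the associated features jointly across all outputs.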
