Feature Selection with Integrated Relevance and Redundancy Optimization

Feature selection aims to choose a subset of the original features according to a predefined criterion, with the goals of removing irrelevant and redundant features, improving prediction performance, and reducing the computational cost of data mining algorithms. In this paper, we integrate feature relevance and redundancy explicitly into the feature selection criterion. Spectral feature analysis is employed, which applies to both supervised and unsupervised learning problems. Specifically, we formulate feature selection as a combinatorial optimization problem that simultaneously maximizes the relevance and minimizes the redundancy of the selected feature subset. The problem can be relaxed and solved with an efficient extended power method with guaranteed global convergence. Extensive experiments demonstrate the advantages of the proposed technique in improving prediction performance and reducing redundancy in the data.
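The abstract does not spell out the concrete formulation, so the following is a minimal, illustrative sketch of the relevance-redundancy trade-off it describes: score each feature's relevance, penalize pairwise redundancy, relax the binary selection indicator to a nonnegative unit-norm weight vector, and solve the relaxed problem with a projected power iteration. The Pearson-correlation relevance and redundancy measures, the trade-off weight `lam`, and the helper `select_features` are assumptions for illustration only; they stand in for the paper's actual spectral measures and extended power method.

```python
import numpy as np

def select_features(X, y, k, lam=0.5, n_iter=200, tol=1e-8):
    """Illustrative relevance-redundancy feature selection.

    Relevance r_i: |Pearson correlation| between feature i and target y.
    Redundancy R_ij: |Pearson correlation| between features i and j.
    Relaxed objective: maximize w^T (diag(r) - lam * R) w
    over nonnegative, unit-norm w, via a projected power iteration.
    """
    n, d = X.shape
    # Standardize so inner products become correlations.
    Xc = (X - X.mean(0)) / (X.std(0) + 1e-12)
    yc = (y - y.mean()) / (y.std() + 1e-12)
    r = np.abs(Xc.T @ yc) / n        # relevance vector, shape (d,)
    R = np.abs(Xc.T @ Xc) / n        # redundancy matrix, shape (d, d)
    np.fill_diagonal(R, 0.0)         # a feature is not redundant with itself

    M = np.diag(r) - lam * R
    # Shift M to be positive definite so the power iteration is well behaved.
    M += (np.abs(np.linalg.eigvalsh(M)).max() + 1e-6) * np.eye(d)

    w = np.full(d, 1.0 / np.sqrt(d))
    for _ in range(n_iter):
        w_new = np.clip(M @ w, 0.0, None)        # project onto w >= 0
        w_new /= np.linalg.norm(w_new) + 1e-12   # renormalize to unit length
        if np.linalg.norm(w_new - w) < tol:
            break
        w = w_new
    return np.argsort(-w)[:k]        # indices of the top-k weighted features
```

In this sketch, the returned indices would be fed to any downstream learner, and `lam` controls how strongly mutually redundant features suppress one another: `lam = 0` reduces to ranking features by relevance alone, while larger values favor diverse subsets.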
