Feature Selection for Multiclass Problems Based on Information Weights

Abstract Before a pattern classifier can be properly designed, it is necessary to consider the feature extraction and data reduction problems. It is evident that the number of features needed to successfully perform a given recognition task depends on the discriminatory qualities of the chosen features. We propose a new hybrid feature selection approach based on information weights, which allows features to be categorized with respect to the specified classification task. The purpose is to efficiently achieve a high degree of dimensionality reduction while enhancing or maintaining predictive accuracy with the selected features. The novelty is to combine the competitiveness of the filter approach, which makes the method independent of the particular pattern classifier, with embedding the algorithm within the pattern classifier structure in order to increase the accuracy of the learning phase, as wrapper algorithms do. The algorithm is generalized to the multiclass setting.
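
The sketch below illustrates the general shape of such a hybrid scheme, not the paper's specific algorithm: mutual information stands in for the unspecified "information weights", a random forest stands in for the embedded pattern classifier, and the 1% accuracy tolerance and the iris data are purely illustrative assumptions.

```python
# Hybrid filter/wrapper feature-selection sketch (assumptions noted above):
# 1) filter step ranks features by an information weight,
# 2) wrapper-style step grows the ranked subset inside a classifier and
#    keeps the smallest subset whose accuracy is close to the best found.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)          # small multiclass example

# Filter step: information weight per feature (here: mutual information).
weights = mutual_info_classif(X, y, random_state=0)
ranking = np.argsort(weights)[::-1]        # most informative feature first

# Wrapper-style step: evaluate nested subsets with the embedded classifier.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
scores = []
for k in range(1, X.shape[1] + 1):
    subset = ranking[:k]
    scores.append(cross_val_score(clf, X[:, subset], y, cv=5).mean())

# Keep the smallest subset within 1% of the best cross-validated accuracy.
best = max(scores)
k_selected = next(k for k, acc in enumerate(scores, start=1) if acc >= best - 0.01)
print("selected features:", ranking[:k_selected].tolist(),
      "cv accuracy:", round(scores[k_selected - 1], 3))
```

Because the filter ranking is computed once and only nested subsets are evaluated, the wrapper loop stays linear in the number of features, which is the kind of trade-off between filter efficiency and wrapper accuracy the abstract describes.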
