The problem of feature subset selection can be defined as the selection of a relevant subset of features that allows a learning algorithm to induce small, high-accuracy models. This problem is of primary importance because irrelevant and redundant features may slow down the learner, especially in the context of high dimensionality, and reduce both the accuracy and the comprehensibility of the induced model. Two main approaches have been developed: the first is algorithm-independent and considers only the data (the filter approach), while the second is algorithm-dependent and takes into account both the data and a given learning algorithm (the wrapper approach). Recent work has studied the usefulness of rough set theory, and more particularly its notions of reducts and core, for the problem of feature subset selection. Different methods were proposed to select features using both the core and reduct concepts, whereas other research shows that useful feature subsets do not necessarily contain all the features in cores. In this paper, we underline the fact that rough set theory is concerned with the deterministic analysis of attribute dependencies, which underlie the two notions of reduct and core. We extend the notion of dependency so that both deterministic and non-deterministic dependencies can be found. A new notion of strong reducts is then introduced and leads to the definition of strong feature subsets (SFS). The interest of SFS is illustrated by the improved accuracy of C4.5 on real-world datasets. Our study shows that, in general, the highest-accuracy subset is not the best one with respect to the filter criteria; the new approach finds the highest-accuracy subset at minimum cost. The contribution of this work is fourfold: (1) an analysis of feature subset selection in the rough set context, (2) the introduction of new definitions based on a generalized rough set theory, α-RST, (3) a reformulation of the selection problem, and (4) the description of a hybrid method combining the filter and wrapper approaches.
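Since the dependency degree and reducts are the technical backbone of the discussion above, a small illustration may help. The following Python sketch computes the classical Pawlak dependency degree γ (the fraction of objects in the positive region) and runs a forward greedy search for a dependency-preserving attribute subset. All names (partition, dependency, greedy_reduct) and the toy decision table are illustrative assumptions; this is the standard deterministic construction, not the paper's α-RST extension or its SFS induction method.

```python
from collections import defaultdict

def partition(rows, attrs):
    """Group row indices into indiscernibility classes w.r.t. `attrs`."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())

def dependency(rows, cond, dec):
    """Pawlak dependency degree gamma(cond, dec): the fraction of objects
    whose cond-class is wholly contained in a single decision class."""
    dec_classes = [set(b) for b in partition(rows, dec)]
    pos = sum(len(b) for b in partition(rows, cond)
              if any(set(b) <= d for d in dec_classes))
    return pos / len(rows)

def greedy_reduct(rows, cond, dec):
    """Forward greedy search for a subset preserving the full dependency.
    Returns a dependency-preserving subset (a superset of some reduct);
    a backward pruning pass would be needed to guarantee minimality."""
    target = dependency(rows, cond, dec)
    chosen, remaining = [], list(cond)
    while remaining and dependency(rows, chosen, dec) < target:
        best = max(remaining, key=lambda a: dependency(rows, chosen + [a], dec))
        chosen.append(best)
        remaining.remove(best)
    return chosen

# Toy decision table: attribute "b" alone determines the decision "d".
rows = [
    {"a": 1, "b": 0, "c": 1, "d": "yes"},
    {"a": 1, "b": 1, "c": 0, "d": "no"},
    {"a": 0, "b": 1, "c": 1, "d": "no"},
    {"a": 0, "b": 0, "c": 1, "d": "yes"},
]
print(dependency(rows, ["a", "b", "c"], ["d"]))    # 1.0 (deterministic table)
print(greedy_reduct(rows, ["a", "b", "c"], ["d"]))  # ['b']
```

A non-deterministic relaxation in the spirit of variable-precision rough set models would replace the strict inclusion test in dependency with a threshold on the fraction of each block falling inside a decision class; the exact α-RST definitions used in the paper differ and are not reproduced here.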