This paper presents extended Relief algorithms and their use in instance-based feature filtering for document feature selection. Relief algorithms are general and successful feature estimators that detect conditional dependencies of features between instances, and they are applied in the preprocessing step for document classification and regression. Since the introduction of the Relief algorithm, many extended Relief algorithms have been suggested as solutions to the problems of redundant, irrelevant, and noisy features, as well as to the limitations of the Relief algorithm on two-class and multi-class datasets. In this paper, we introduce additional problems, including the negative influence on computed similarities and weights caused by the small number of features in an instance, the absence of nearest Hits or nearest Misses for some instances when using Relief algorithms, and other problems. We propose new extended Relief algorithms to solve those problems. In our experiments, we estimated the quality of features from instances, classified datasets, and compared the results of the new extended Relief algorithms with those of the Relief algorithms. In the experimental results, the new extended Relief algorithms showed better performance than the Relief algorithms on all of the datasets.
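For orientation, the basic Relief weight update that these extensions build on can be sketched as follows. This is a minimal illustration under stated assumptions, not the extended algorithms proposed in this paper: the function name `relief_weights` is hypothetical, features are assumed to be numeric and scaled to [0, 1], distance is assumed to be Manhattan, every instance is visited once instead of being sampled at random, and instances that lack a nearest Hit or nearest Miss are simply skipped.

```python
def relief_weights(X, y):
    """Estimate per-feature weights with a basic Relief-style update.

    X: list of equal-length lists of numeric features, assumed scaled to [0, 1].
    y: list of class labels, one per instance.
    Assumption: every instance is visited once (no random sampling).
    """
    n, d = len(X), len(X[0])
    weights = [0.0] * d
    used = 0  # number of instances that had both a hit and a miss

    def distance(a, b):
        # Manhattan distance between two instances (an assumption of this sketch).
        return sum(abs(p - q) for p, q in zip(a, b))

    for i in range(n):
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        if not hits or not misses:
            # No nearest Hit or nearest Miss exists for this instance.
            # Skipping is this sketch's choice, not the paper's remedy.
            continue
        hit = min(hits, key=lambda j: distance(X[i], X[j]))
        miss = min(misses, key=lambda j: distance(X[i], X[j]))
        for f in range(d):
            # Reward features that differ on the nearest Miss,
            # penalize features that differ on the nearest Hit.
            weights[f] += abs(X[i][f] - X[miss][f]) - abs(X[i][f] - X[hit][f])
        used += 1

    # Average over the instances that contributed an update.
    return [w / used for w in weights] if used else weights


if __name__ == "__main__":
    # Toy data: feature 0 separates the two classes, feature 1 does not.
    X = [[0.0, 0.1], [0.1, 0.9], [1.0, 0.2], [0.9, 0.8]]
    y = [0, 0, 1, 1]
    print(relief_weights(X, y))
```

On the toy data, the feature that separates the classes receives a positive weight and the uninformative feature a negative one. The skipped-instance branch marks where the absence of a nearest Hit or nearest Miss, one of the problems named above, arises in practice.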