An Adaptation of Relief for Redundant Feature Elimination

Feature selection is important for many learning problems, improving both learning speed and quality. The main approaches are individual evaluation and subset evaluation methods. Individual evaluation methods, such as Relief, are efficient but cannot detect redundant features, which limits their applicability. A new feature selection algorithm that removes both irrelevant and redundant features is proposed, based on the basic idea of Relief. For each feature, not only its effectiveness is evaluated, but also its informativeness is assessed according to the performance of the other features. Experiments on benchmark datasets show that the new algorithm removes both irrelevant and redundant features while retaining the efficiency of an individual evaluation method.
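The classic Relief procedure that the proposed algorithm builds on can be sketched as follows. This is a minimal illustration, not the paper's extended method: it assumes numeric features, uses L1 distance, and samples instances to update per-feature weights against the nearest hit (same class) and nearest miss (other class). Function and variable names are chosen here for illustration only.

```python
import numpy as np

def relief_weights(X, y, n_samples=None, rng=None):
    """Sketch of the classic Relief update (Kira & Rendell, 1992).

    Features that differ on the nearest miss gain weight; features
    that differ on the nearest hit lose weight.
    """
    rng = rng or np.random.default_rng(0)
    n, d = X.shape
    m = n_samples or n
    # Scale feature differences to [0, 1] so weights are comparable.
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0
    w = np.zeros(d)
    for i in rng.integers(0, n, size=m):
        dist = np.abs(X - X[i]).sum(axis=1)
        dist[i] = np.inf  # exclude the sampled instance itself
        same = (y == y[i])
        hit = np.where(same, dist, np.inf).argmin()
        miss = np.where(~same, dist, np.inf).argmin()
        w += (np.abs(X[i] - X[miss]) - np.abs(X[i] - X[hit])) / (span * m)
    return w
```

Because each feature's weight depends only on its own hit/miss differences, an exact duplicate of a relevant feature receives the same high weight as the original. This is the redundancy blindness the abstract points to, and the motivation for additionally judging each feature's informativeness against the other features.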
