Feature Subset Selection with Hybrids of Filters and Evolutionary Algorithms

Summary. The performance of classification algorithms is affected by the features used to describe the labeled examples presented to the inducers, so the problem of feature subset selection has received considerable attention. Approaches to this problem based on evolutionary algorithms (EAs) typically use the wrapper method, treating the inducer as a black box that evaluates candidate feature subsets. However, these evaluations can take considerable time, making the wrapper approach impractical for large data sets. Alternative filter methods use heuristics to select feature subsets from the data and are usually considered more scalable than wrappers with respect to the dimensionality and volume of the data. This chapter describes hybrids of EAs and filter methods applied to the selection of feature subsets for classification problems. The proposed hybrids were compared against each of their components, two widely used feature selection wrappers, and another filter-wrapper hybrid. The objective of this chapter is to determine whether the proposed evolutionary hybrids offer advantages over the other methods in terms of accuracy or speed. The experiments used decision tree and naive Bayes (NB) classifiers on public-domain and artificial data sets. The experimental results suggest that the evolutionary hybrids usually find compact feature subsets that yield the most accurate classifiers, while beating the execution time of the other wrappers.
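The wrapper scheme described above can be sketched with a minimal genetic algorithm: individuals are bit masks over the feature set, and each mask is scored by a fitness function that stands in for training and evaluating the inducer on that subset. This is an illustrative sketch only, not the chapter's method; the toy fitness function, the set of "informative" feature indices, and all parameter values are assumptions chosen to keep the example self-contained (a real wrapper would replace `fitness` with a cross-validated classifier accuracy, which is exactly the expensive step the filter hybrids aim to reduce).

```python
import random

random.seed(0)

N_FEATURES = 10
INFORMATIVE = {0, 3, 7}  # hypothetical: indices of the truly relevant features

def fitness(mask):
    # Stand-in for the wrapper's inducer evaluation: reward selecting
    # informative features, lightly penalize subset size (favors compactness).
    hits = sum(1 for i in INFORMATIVE if mask[i])
    return hits - 0.1 * sum(mask)

def ga_feature_select(pop_size=20, generations=30, p_mut=0.05):
    # Individuals are bit masks: mask[i] == 1 means feature i is selected.
    pop = [[random.randint(0, 1) for _ in range(N_FEATURES)]
           for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        parents = scored[: pop_size // 2]            # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, N_FEATURES)    # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (random.random() < p_mut)  # bit-flip mutation
                     for bit in child]
            children.append(child)
        pop = parents + children                     # elitist replacement
    return max(pop, key=fitness)

best = ga_feature_select()
print(sorted(i for i, bit in enumerate(best) if bit))
```

Because the mask-scoring call is the inner loop, the filter-hybrid idea amounts to making `fitness` cheap (a data-driven heuristic) for most evaluations instead of invoking the inducer every time.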
