Feature selection in data mining

Feature subset selection is an important problem in knowledge discovery, not only for the insight gained from determining relevant modeling variables, but also for the improved understandability, scalability, and, possibly, accuracy of the resulting models. The purpose of this chapter is to provide a comprehensive analysis of feature selection via evolutionary search in supervised and unsupervised learning. To achieve this purpose, we first discuss a general framework for feature selection based on a new search algorithm, the Evolutionary Local Selection Algorithm (ELSA). The search is formulated as a multi-objective optimization problem to examine the trade-off between the complexity of the generated solutions and their quality. ELSA considers multiple objectives efficiently while avoiding computationally expensive global comparisons. We combine ELSA with Artificial Neural Networks (ANNs) and the Expectation-Maximization (EM) algorithm for feature selection in supervised and unsupervised learning, respectively. Further, we provide a new two-level evolutionary algorithm, Meta-Evolutionary Ensembles (MEE), in which feature selection is used to promote diversity among classifiers in the same ensemble.
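To make the local-selection idea concrete, the sketch below shows one plausible reading of an ELSA-style search over feature subsets. It is an illustrative toy, not the chapter's actual algorithm: the objective functions, the bin discretization, and all numeric parameters (`COST`, `THRESHOLD`, `E_TOTAL`) are assumptions, and `accuracy` is a synthetic stand-in for training an ANN or running EM on the selected subset. The key mechanism is that agents draw energy from shared per-objective bins, so crowded regions of objective space yield less energy per agent and selection stays local, with no population-wide fitness comparison.

```python
import random

random.seed(1)

N_FEATURES = 10
RELEVANT = {0, 1, 2}   # hypothetical ground truth: only these features matter

def accuracy(mask):
    # Synthetic stand-in for solution quality; a real system would train an
    # ANN (supervised) or run EM (unsupervised) on the selected subset.
    hits = len({i for i, b in enumerate(mask) if b} & RELEVANT)
    noise = sum(mask) - hits
    return max(0.0, hits / len(RELEVANT) - 0.05 * noise)

def complexity(mask):
    # Second objective: prefer smaller feature subsets.
    return 1.0 - sum(mask) / N_FEATURES

OBJECTIVES = (accuracy, complexity)

def bin_key(obj, mask):
    # Discretize each objective into bins; energy within a bin is shared,
    # penalizing crowded regions of objective space without any global
    # (population-wide) comparison.
    return (obj.__name__, round(obj(mask), 1))

COST = 0.2        # energy an agent pays each time step (assumed value)
THRESHOLD = 1.0   # energy needed to reproduce (assumed value)
E_TOTAL = 4.0     # energy inflow per objective per step (assumed value)

env = {}          # bin -> stored environmental energy
agents = [{"mask": [random.randint(0, 1) for _ in range(N_FEATURES)],
           "energy": 0.5} for _ in range(20)]

for step in range(300):
    # Replenish each objective's occupied bins with an equal share of E_TOTAL.
    for obj in OBJECTIVES:
        keys = {bin_key(obj, a["mask"]) for a in agents}
        for k in keys:
            env[k] = env.get(k, 0.0) + E_TOTAL / len(keys)
    random.shuffle(agents)
    next_gen = []
    for a in agents:
        for obj in OBJECTIVES:
            k = bin_key(obj, a["mask"])
            take = min(obj(a["mask"]), env.get(k, 0.0))  # local energy intake
            env[k] = env.get(k, 0.0) - take
            a["energy"] += take
        a["energy"] -= COST
        if a["energy"] > THRESHOLD:
            # Reproduce: flip one feature bit, split energy with the child.
            child_mask = a["mask"][:]
            i = random.randrange(N_FEATURES)
            child_mask[i] = 1 - child_mask[i]
            a["energy"] /= 2
            next_gen.append({"mask": child_mask, "energy": a["energy"]})
        if a["energy"] > 0:
            next_gen.append(a)   # agent survives; otherwise it dies
    agents = next_gen

best = max(agents, key=lambda a: accuracy(a["mask"]))
print("best subset:", sorted(i for i, b in enumerate(best["mask"]) if b))
```

Because total energy inflow per step is fixed, the population size self-regulates (roughly at total inflow divided by `COST`), and the surviving agents spread along the quality/complexity trade-off rather than collapsing onto a single solution.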
