Feature Selection Using Single/Multi-Objective Memetic Frameworks

Memetic frameworks that hybridize wrapper and filter feature selection methods have been proposed for classification problems. These frameworks incorporate filter methods into traditional evolutionary algorithms to improve classification performance while accelerating the search for crucial feature subsets. The filter methods act as local learning procedures within the evolutionary search, adding features to or deleting features from the chromosome that encodes the selected feature subset. Both single-objective and multi-objective memetic frameworks are described in this chapter. The single-objective memetic framework is shown to speed up the identification of the optimal feature subset while maintaining good prediction accuracy. The multi-objective memetic framework then extends the notion of an optimal feature subset to the simultaneous identification of full class relevant (FCR) and partial class relevant (PCR) features in multiclass problems. A comparison study against existing state-of-the-art filter and wrapper methods and the standard genetic algorithm highlights the efficacy of the memetic framework in achieving a good compromise between classification accuracy and the number of selected features on binary and multiclass problems.
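The single-objective idea described above (a genetic algorithm whose chromosomes are feature masks, refined by filter-guided add and delete moves) might be sketched as follows. This is only an illustrative sketch under stated assumptions, not the chapter's implementation: the filter score (absolute difference of class means), the wrapper fitness (leave-one-out 1-NN accuracy), the GA operators, the toy data, and every function name are hypothetical choices made here for brevity.

```python
import random

def make_toy_data(n_samples=40, n_features=6, seed=1):
    """Hypothetical binary toy data: only feature 0 carries class information."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0, 1) for _ in range(n_features)]
        row[0] = 6.0 * label + rng.gauss(0, 0.3)
        X.append(row)
        y.append(label)
    return X, y

def filter_scores(X, y):
    """Assumed filter: absolute difference of per-class feature means."""
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, lab in zip(X, y) if lab == 1]
        neg = [row[j] for row, lab in zip(X, y) if lab == 0]
        scores.append(abs(sum(pos) / len(pos) - sum(neg) / len(neg)))
    return scores

def loo_1nn_accuracy(X, y, mask):
    """Assumed wrapper: leave-one-out 1-NN accuracy on the selected features."""
    idx = [j for j, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0
    correct = 0
    for i, xi in enumerate(X):
        nearest = min((k for k in range(len(X)) if k != i),
                      key=lambda k: sum((xi[j] - X[k][j]) ** 2 for j in idx))
        correct += y[nearest] == y[i]
    return correct / len(X)

def fitness(X, y, mask):
    # Accuracy first; fewer selected features breaks ties.
    return (loo_1nn_accuracy(X, y, mask), -sum(mask))

def local_search(X, y, mask, scores):
    """Try adding the best-ranked excluded feature and deleting the
    worst-ranked included one; keep a move only if fitness improves."""
    best, best_fit = mask[:], fitness(X, y, mask)
    excluded = [j for j, b in enumerate(mask) if not b]
    included = [j for j, b in enumerate(mask) if b]
    candidates = []
    if excluded:
        cand = mask[:]
        cand[max(excluded, key=scores.__getitem__)] = 1
        candidates.append(cand)
    if included:
        cand = mask[:]
        cand[min(included, key=scores.__getitem__)] = 0
        candidates.append(cand)
    for cand in candidates:
        cand_fit = fitness(X, y, cand)
        if cand_fit > best_fit:
            best, best_fit = cand, cand_fit
    return best

def memetic_ga(X, y, pop_size=10, generations=15, seed=0):
    rng = random.Random(seed)
    n = len(X[0])
    scores = filter_scores(X, y)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        # Local learning step: refine every chromosome with filter-guided moves.
        pop = [local_search(X, y, m, scores) for m in pop]
        pop.sort(key=lambda m: fitness(X, y, m), reverse=True)
        parents = pop[:pop_size // 2]  # truncation selection, parents survive
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            child[rng.randrange(n)] ^= 1  # mutation: flip one random bit
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda m: fitness(X, y, m))

X, y = make_toy_data()
best = memetic_ga(X, y)
print(best, fitness(X, y, best))
```

In this sketch the refined chromosome replaces the original (a Lamarckian choice), and the subset size enters the fitness only as a tie-breaker. A multi-objective variant would instead treat accuracy and subset size as separate objectives and keep a set of non-dominated masks.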
