Towards a Memetic Feature Selection Paradigm [Application Notes]

Feature selection has become the focus of many real-world, application-oriented developments and much applied research in recent years. With the rapid advancement of computer and database technologies, problems with hundreds or thousands of variables or features are now ubiquitous in pattern recognition, data mining, and machine learning [1], [2]. In this article, we consider two real-world feature selection applications: gene selection in cancer classification based on microarray data and band selection for pixel classification using hyperspectral imagery data.

[1]  Thomas A. Darden,et al.  Gene selection for sample classification based on gene expression data: study of sensitivity to choice of parameters of the GA/KNN method , 2001, Bioinform..

[2]  Anil K. Jain,et al.  Dimensionality reduction using genetic algorithms , 2000, IEEE Trans. Evol. Comput..

[3]  T. Poggio,et al.  Prediction of central nervous system embryonal tumour outcome based on gene expression , 2002, Nature.

[4]  Anuj Srivastava,et al.  A Bayesian MRF framework for labeling terrain using hyperspectral imaging , 2005, IEEE Transactions on Geoscience and Remote Sensing.

[5]  Andy J. Keane,et al.  Meta-Lamarckian learning in memetic algorithms , 2004, IEEE Transactions on Evolutionary Computation.

[6]  Xuefeng Bruce Ling,et al.  Multiclass cancer classification and biomarker discovery using GA-based algorithms , 2005, Bioinform..

[7]  E. Petricoin,et al.  Use of proteomic patterns in serum to identify ovarian cancer , 2002, The Lancet.

[8]  Edoardo Amaldi,et al.  On the Approximability of Minimizing Nonzero Variables or Unsatisfied Relations in Linear Systems , 1998, Theor. Comput. Sci..

[9]  Josef Kittler,et al.  Floating search methods in feature selection , 1994, Pattern Recognit. Lett..

[10]  Huan Liu,et al.  A Probabilistic Approach to Feature Selection - A Filter Solution , 1996, ICML.

[11]  Jihoon Yang,et al.  Feature Subset Selection Using a Genetic Algorithm , 1998, IEEE Intell. Syst..

[12]  Hisao Ishibuchi,et al.  Balance between genetic search and local search in memetic algorithms for multiobjective permutation flowshop scheduling , 2003, IEEE Trans. Evol. Comput..

[13]  Tao Li,et al.  A comparative study of feature selection and multiclass classification methods for tissue classification based on gene expression , 2004, Bioinform..

[14]  Yudong D. He,et al.  Gene expression profiling predicts clinical outcome of breast cancer , 2002, Nature.

[15]  James E. Baker,et al.  Adaptive Selection Methods for Genetic Algorithms , 1985, International Conference on Genetic Algorithms.

[16]  S. Ramaswamy,et al.  Translation of microarray data into clinically relevant cancer diagnostic tests using gene expression ratios in lung cancer and mesothelioma. , 2002, Cancer research.

[17]  Yew-Soon Ong,et al.  A Probabilistic Memetic Framework , 2009, IEEE Transactions on Evolutionary Computation.

[18]  Isabelle Guyon,et al.  An Introduction to Variable and Feature Selection , 2003, J. Mach. Learn. Res..

[19]  Jim E. Smith,et al.  Coevolving Memetic Algorithms: A Review and Progress Report , 2007, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[20]  Shigeru Obayashi,et al.  Development and investigation of efficient GA/PSO-hybrid algorithm applicable to real-world design optimization , 2009, 2009 IEEE Congress on Evolutionary Computation.

[21]  Xin Yao,et al.  A Memetic Algorithm for VLSI Floorplanning , 2007, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[22]  John H. Holland,et al.  Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence , 1992 .

[23]  Edward R. Dougherty,et al.  Is cross-validation valid for small-sample microarray classification? , 2004, Bioinform..

[24]  Ronald W. Davis,et al.  Quantitative Monitoring of Gene Expression Patterns with a Complementary DNA Microarray , 1995, Science.

[25]  David E. Goldberg,et al.  Improving the efficiency of the extended compact genetic algorithm , 2008, GECCO '08.

[26]  Lakhmi C. Jain,et al.  Evolutionary Multiobjective Optimization , 2005, Evolutionary Multiobjective Optimization.

[27]  Huan Liu,et al.  Feature Selection for Classification , 1997, Intell. Data Anal..

[28]  Zhen Ji,et al.  Band Selection for Hyperspectral Imagery Using Affinity Propagation , 2008, 2008 Digital Image Computing: Techniques and Applications.

[29]  Chou-Yuan Lee,et al.  A hybrid search algorithm with heuristics for resource allocation problem , 2005, Inf. Sci..

[30]  Delbert Dueck,et al.  Clustering by Passing Messages Between Data Points , 2007, Science.

[31]  Ron Kohavi,et al.  Wrappers for Feature Subset Selection , 1997, Artif. Intell..

[32]  Natalio Krasnogor,et al.  Studies on the theory and design space of memetic algorithms , 2002 .

[33]  Patrick Tan,et al.  Genetic algorithms applied to multi-class prediction for the analysis of gene expression data , 2003, Bioinform..

[34]  Natalio Krasnogor,et al.  Adaptive Cellular Memetic Algorithms , 2009, Evolutionary Computation.

[35]  Zexuan Zhu,et al.  Markov blanket-embedded genetic algorithm for gene selection , 2007, Pattern Recognit..

[36]  Lorenzo Bruzzone,et al.  Classification of hyperspectral remote sensing images with support vector machines , 2004, IEEE Transactions on Geoscience and Remote Sensing.

[37]  S. Chen,et al.  Fast and accurate feature selection using hybrid genetic strategies , 1999, Proceedings of the 1999 Congress on Evolutionary Computation-CEC99 (Cat. No. 99TH8406).

[38]  Yanchun Liang,et al.  Clonal Selection Based Memetic Algorithm for Job Shop Scheduling Problems , 2008 .

[39]  E. Lander,et al.  MLL translocations specify a distinct gene expression profile that distinguishes a unique leukemia , 2002, Nature Genetics.

[40]  J. Mesirov,et al.  Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. , 1999, Science.

[41]  Werner Dubitzky,et al.  A Practical Approach to Microarray Data Analysis , 2003, Springer US.

[42]  Zexuan Zhu,et al.  Wrapper–Filter Feature Selection Algorithm Using a Memetic Framework , 2007, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[43]  Kwang-Ting Cheng,et al.  Fundamentals of algorithms , 2009 .

[44]  Hitoshi Iba,et al.  The Memetic Tree-based Genetic Algorithm and its application to Portfolio Optimization , 2009, Memetic Comput..

[45]  Byung Ro Moon,et al.  Hybrid Genetic Algorithms for Feature Selection , 2004, IEEE Trans. Pattern Anal. Mach. Intell..

[46]  Thomas G. Dietterich,et al.  Learning Boolean Concepts in the Presence of Many Irrelevant Features , 1994, Artif. Intell..

[47]  Ruhul A. Sarker,et al.  Memetic algorithms for solving job-shop scheduling problems , 2009, Memetic Comput..

[48]  Marko Robnik-Sikonja,et al.  Theoretical and Empirical Analysis of ReliefF and RReliefF , 2003, Machine Learning.

[49]  Vladimir Vapnik,et al.  Statistical learning theory , 1998 .

[50]  Bo Liu,et al.  An Effective PSO-Based Memetic Algorithm for Flow Shop Scheduling , 2007, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[51]  Kalyanmoy Deb,et al.  Evolutionary multiobjective optimization , 2007, GECCO '07.

[52]  U. Alon,et al.  Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. , 1999, Proceedings of the National Academy of Sciences of the United States of America.

[53]  Xin Yao,et al.  Memetic Algorithm With Extended Neighborhood Search for Capacitated Arc Routing Problems , 2009, IEEE Transactions on Evolutionary Computation.

[54]  Geoffrey J McLachlan,et al.  Selection bias in gene extraction on the basis of microarray gene-expression data , 2002, Proceedings of the National Academy of Sciences of the United States of America.

[55]  T. Golub,et al.  Gene expression-based classification of malignant gliomas correlates better with survival than histological classification. , 2003, Cancer research.

[56]  Huan Liu,et al.  Toward integrating feature selection algorithms for classification and clustering , 2005, IEEE Transactions on Knowledge and Data Engineering.

[57]  Harold Soh,et al.  Discovering Unique, Low-Energy Pure Water Isomers: Memetic Exploration, Optimization, and Landscape Analysis , 2010, IEEE Transactions on Evolutionary Computation.

[58]  Lorenzo Bruzzone,et al.  A Novel Approach to the Selection of Spatially Invariant Features for the Classification of Hyperspectral Images With Improved Generalization Capability , 2009, IEEE Transactions on Geoscience and Remote Sensing.

[59]  S. Ishii,et al.  A multi-class predictor based on a probabilistic model: application to gene expression profiling-based diagnosis of thyroid tumors , 2006, BMC Genomics.

[60]  Gary G. Yen.  Evolutionary multiobjective optimization [Editor's remarks] , 2009, IEEE Comput. Intell. Mag..

[61]  Huan Liu,et al.  Efficient Feature Selection via Analysis of Relevance and Redundancy , 2004, J. Mach. Learn. Res..

[62]  Andrew Adamatzky,et al.  Genetic approaches to search for computing patterns in cellular automata , 2009, IEEE Computational Intelligence Magazine.

[63]  Shoichi Hasegawa,et al.  Development and investigation of efficient GA/PSO-HYBRID algorithm applicable to real-world design optimization , 2009, IEEE Comput. Intell. Mag..

[64]  Keinosuke Fukunaga,et al.  A Branch and Bound Algorithm for Feature Subset Selection , 1977, IEEE Transactions on Computers.

[65]  Ash A. Alizadeh,et al.  Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling , 2000, Nature.

[66]  Mark A. Hall,et al.  Correlation-based Feature Selection for Discrete and Numeric Class Machine Learning , 1999, ICML.

[67]  M. Ringnér,et al.  Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks , 2001, Nature Medicine.

[68]  Kevin Kok Wai Wong,et al.  Classification of adaptive memetic algorithms: a comparative study , 2006, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[69]  R. Kohavi,et al.  Wrapper for feature subset selection , 1997 .