A Comprehensive Comparison on Evolutionary Feature Selection Approaches to Classification

Feature selection is an important data preprocessing step in machine learning and data mining tasks such as classification. Research on feature selection has been conducted extensively for more than 50 years, and different types of approaches have been proposed, including wrapper and filter approaches, and single-objective and multi-objective approaches. However, the advantages and disadvantages of these approaches have not been thoroughly investigated. This paper provides a comprehensive comparison of different types of feature selection approaches, specifically comparing the classification performance and computational time of wrappers and filters, the generality of wrapper approaches, and single-objective versus multi-objective approaches. Particle swarm optimization (PSO)-based approaches, which include different types of methods, are used as typical examples to conduct this research. A total of 10 different feature selection methods and over 7000 experiments are involved. The results show that filters are usually faster than wrappers, but wrappers using a simple classification algorithm can be faster than filters. Wrappers often achieve better classification performance than filters. Feature subsets obtained from wrappers can generalize to other classification algorithms. Meanwhile, multi-objective approaches are generally better choices than single-objective algorithms. The findings are useful not only for researchers developing new approaches to address new challenges in feature selection, but also for real-world decision makers choosing a specific feature selection method according to their own requirements.
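To make the wrapper/filter distinction concrete, the following is a minimal sketch of how the two kinds of approach might score a candidate feature subset. It is not the paper's implementation: the function names, the toy dataset, the choice of mutual information as the filter measure, and the leave-one-out 1-nearest-neighbour classifier as the wrapper's learner are all assumptions made for illustration. In a PSO-based method, either score could serve as the fitness of a particle that encodes a feature subset.

```python
import math
from collections import Counter

# Hypothetical toy data: feature 0 equals the class label, feature 1 is noise.
X = [[0, 0], [0, 1], [1, 0], [1, 1], [0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 0, 1, 1, 0, 0, 1, 1]


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2((c / n) / ((px[a] / n) * (py[b] / n)))
        for (a, b), c in pxy.items()
    )


def filter_fitness(X, y, subset):
    """Filter-style score: relevance of the selected features to the class,
    measured without training any classifier (assumed measure: summed MI)."""
    return sum(mutual_information([row[j] for row in X], y) for j in subset)


def wrapper_fitness(X, y, subset):
    """Wrapper-style score: leave-one-out accuracy of a 1-nearest-neighbour
    classifier that only sees the selected features."""
    if not subset:
        return 0.0
    correct = 0
    for i, row in enumerate(X):
        best_label, best_dist = None, math.inf
        for k, other in enumerate(X):
            if k == i:
                continue  # leave the query instance out
            dist = sum((row[j] - other[j]) ** 2 for j in subset)
            if dist < best_dist:
                best_dist, best_label = dist, y[k]
        correct += best_label == y[i]
    return correct / len(X)


for subset in ([0], [1]):
    print(subset, filter_fitness(X, y, subset), wrapper_fitness(X, y, subset))
```

On this toy data both scores prefer the informative feature over the noisy one. The wrapper score requires running a classifier for every candidate subset, which is why its cost depends on how simple that classifier is, while the filter score depends only on statistics of the data.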
