A tribe competition-based genetic algorithm for feature selection in pattern classification

Abstract Feature selection is a critical step in pattern recognition, for which evolutionary algorithms such as the genetic algorithm (GA) are most commonly used. However, the individual encoding schemes used in existing GAs either impose a bias on the solution or require the number of features to be specified in advance, and hence may lead to less accurate results. In this paper, a tribe competition-based genetic algorithm (TCbGA) is proposed for feature selection in pattern classification. The population of individuals is divided into multiple tribes, and the initialization and evolutionary operations are modified so that the number of selected features in each tribe follows a Gaussian distribution. Each tribe thus focuses on exploring a specific part of the solution space. Tribe competition is also introduced into the evolution process: the winning tribes, i.e. those that produce better individuals, are allowed to grow and so have more individuals with which to search their parts of the solution space. The algorithm therefore avoids both the bias on solutions and the requirement of a pre-specified number of features. We evaluated the algorithm against several state-of-the-art feature selection approaches on 20 benchmark datasets. The results suggest that the proposed TCbGA can identify the optimal feature subset more effectively and produce more accurate pattern classification.
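
The two ideas summarized in the abstract, tribe-wise initialization and tribe competition, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the tribe means and variances or the resizing rule, so the evenly spaced tribe centres, the standard deviation of a quarter of the spacing, the fitness-proportional size allocation, and all function and parameter names (`init_tribe`, `init_population`, `reallocate_sizes`, `min_size`) are assumptions made for illustration only.

```python
import random


def init_tribe(n_features, mean_k, std_k, size, rng):
    """Build `size` binary feature masks whose number of selected
    features is drawn from a Gaussian with mean `mean_k` and standard
    deviation `std_k`, clipped to the range 1..n_features."""
    tribe = []
    for _ in range(size):
        k = int(round(rng.gauss(mean_k, std_k)))
        k = max(1, min(n_features, k))
        chosen = set(rng.sample(range(n_features), k))
        tribe.append([1 if i in chosen else 0 for i in range(n_features)])
    return tribe


def init_population(n_features, n_tribes, tribe_size, rng):
    """Split the population into tribes, each centred on a different
    subset size. Evenly spaced centres are an assumption of this sketch."""
    width = n_features / n_tribes
    return [
        init_tribe(n_features, (t + 0.5) * width, width / 4, tribe_size, rng)
        for t in range(n_tribes)
    ]


def reallocate_sizes(best_fitness, total, min_size=2):
    """Tribe competition (hypothetical rule): every tribe keeps at least
    `min_size` individuals, and the remaining slots are shared in
    proportion to each tribe's best fitness (assumed non-negative).
    Slots lost to rounding go to the winning tribe."""
    n = len(best_fitness)
    spare = total - n * min_size
    if spare < 0:
        raise ValueError("total is too small for the requested min_size")
    weight_sum = sum(best_fitness)
    if weight_sum > 0:
        shares = [int(spare * f / weight_sum) for f in best_fitness]
    else:
        shares = [spare // n] * n
    sizes = [min_size + s for s in shares]
    winner = max(range(n), key=lambda i: best_fitness[i])
    sizes[winner] += total - sum(sizes)
    return sizes


if __name__ == "__main__":
    rng = random.Random(0)
    tribes = init_population(n_features=20, n_tribes=4, tribe_size=10, rng=rng)
    for t, tribe in enumerate(tribes):
        counts = [sum(mask) for mask in tribe]
        print(f"tribe {t}: selected-feature counts {counts}")
    # Placeholder fitness values; in practice they would come from a classifier.
    print(reallocate_sizes([0.9, 0.6, 0.5], total=40))
```

In a full algorithm the fitness of each mask would come from a classifier evaluated on the selected features, and the crossover and mutation operators would also need to preserve each tribe's distribution of subset sizes, as the abstract states. Those parts are omitted here.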
