Whale optimization approaches for wrapper feature selection

Abstract Classification accuracy depends heavily on the nature of the features in a dataset, which may contain irrelevant or redundant data. The main aim of feature selection is to eliminate such features in order to improve classification accuracy. A wrapper feature selection model searches the feature set to reduce the number of features and improve classification accuracy at the same time. In this work, a new wrapper feature selection approach based on the Whale Optimization Algorithm (WOA) is proposed. WOA is a recently proposed algorithm that has not yet been systematically applied to feature selection problems. Two binary variants of WOA are proposed to search for the optimal feature subsets for classification purposes. In the first, we study the influence of using the Tournament and Roulette Wheel selection mechanisms instead of a random operator in the search process. In the second, crossover and mutation operators are used to enhance the exploitation of the WOA algorithm. The proposed methods are tested on standard benchmark datasets and compared with three algorithms, namely Particle Swarm Optimization (PSO), Genetic Algorithm (GA), and the Ant Lion Optimizer (ALO), as well as five standard filter feature selection methods. The paper also presents an extensive study of the parameter settings for the proposed technique. The results show the efficiency of the proposed approaches in searching for the optimal feature subsets.
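
The operators named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the tournament size `k`, the mutation `rate`, the choice of uniform crossover, and the convention that a lower (non-negative) fitness value is better are all assumptions made for illustration. A candidate solution is represented as a binary mask in which 1 means the feature is kept.

```python
import random


def tournament_select(fitness, k=2, rng=random):
    """Draw k distinct candidates at random and return the index of the
    one with the lowest fitness (assumption: lower is better)."""
    candidates = rng.sample(range(len(fitness)), k)
    return min(candidates, key=lambda i: fitness[i])


def roulette_select(fitness, rng=random):
    """Return an index drawn with probability proportional to
    1 / (fitness + eps), so lower-fitness candidates are more likely.
    Assumes all fitness values are non-negative."""
    eps = 1e-12  # avoids division by zero for a perfect candidate
    weights = [1.0 / (f + eps) for f in fitness]
    return rng.choices(range(len(fitness)), weights=weights, k=1)[0]


def crossover(a, b, rng=random):
    """Uniform crossover (an assumed variant): each bit is copied from
    parent a or parent b with equal probability."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]


def mutate(mask, rate=0.1, rng=random):
    """Bit-flip mutation: each bit is flipped independently with
    probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in mask]


if __name__ == "__main__":
    rng = random.Random(0)  # seeded generator for repeatability
    # Hypothetical population of feature masks and their fitness values.
    population = [[1, 0, 1, 1], [0, 1, 1, 0], [1, 1, 0, 0]]
    fitness = [0.30, 0.12, 0.45]
    parent_a = population[tournament_select(fitness, k=2, rng=rng)]
    parent_b = population[roulette_select(fitness, rng=rng)]
    child = mutate(crossover(parent_a, parent_b, rng=rng), rate=0.1, rng=rng)
    print(child)
```

In a wrapper setting, the fitness of each mask would typically come from training a classifier on the selected features; that evaluation step is omitted here because the abstract does not specify it.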
