Feature Selection via Chaotic Antlion Optimization

Background

Selecting a subset of relevant features from the large set that describes a dataset is a challenging machine learning task. In biology, for instance, advances in available technologies enable the generation of a very large number of biomarkers that describe the data. Choosing the most informative markers while also classifying the data with high accuracy can be daunting, particularly when the data are high dimensional. An often adopted approach is to formulate feature selection as a bi-objective optimization problem that aims to maximize the performance of the data analysis model (the quality of its fit to the training data) while minimizing the number of features used.

Results

We propose an optimization approach for the feature selection problem that uses a "chaotic" version of the antlion optimizer, a nature-inspired algorithm that mimics the hunting mechanism of antlions. Balancing exploration of the search space against exploitation of the best solutions is a challenge in multi-objective optimization. In the antlion optimizer, the exploration/exploitation rate is controlled by the parameter I, which limits the random walk range of the ants (the prey). This parameter is increased iteratively in a quasi-linear manner so that the exploration rate decreases as the optimization progresses. This quasi-linear schedule for I may lead to premature convergence in some cases and to trapping in local minima in others. The chaotic system proposed here attempts to improve the tradeoff between exploration and exploitation. The methodology is evaluated using different chaotic maps on a number of feature selection datasets. To ensure generality, we used ten biological datasets as well as other types of data from various sources. The results are compared with those of the particle swarm optimizer and of genetic algorithm variants for feature selection using a set of quality metrics.
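The two ideas above can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the abstract: the logistic map stands in for "a chaotic map", the function `range_ratio` and its `growth` constant are a hypothetical stand-in for the schedule of I, and the weighted sum with weight `alpha` is one common way to scalarize the two objectives (classification error and number of selected features), not necessarily the formulation used by the authors.

```python
from typing import List, Sequence


def logistic_map(x0: float, steps: int, r: float = 4.0) -> List[float]:
    """Return `steps` iterates of the logistic map x <- r * x * (1 - x).

    For r = 4 and a start value in (0, 1), the iterates stay in [0, 1].
    """
    values = []
    x = x0
    for _ in range(steps):
        x = r * x * (1.0 - x)
        values.append(x)
    return values


def range_ratio(t: int, max_iter: int, growth: float = 100.0,
                chaos: float = 1.0) -> float:
    """Hypothetical schedule for the parameter I (assumption, see lead-in).

    With chaos = 1.0 the ratio grows linearly with the iteration t.
    Passing a chaotic value in [0, 1] perturbs that growth, so the
    random walk range does not shrink monotonically.
    """
    return 1.0 + growth * chaos * t / max_iter


def walk_radius(base_radius: float, ratio: float) -> float:
    """A larger ratio I gives a smaller random walk range (less exploration)."""
    return base_radius / ratio


def scalar_fitness(error_rate: float, mask: Sequence[int],
                   alpha: float = 0.9) -> float:
    """Weighted sum of classification error and fraction of selected features.

    `mask` holds 1 for a selected feature and 0 otherwise. Lower is better.
    """
    selected_fraction = sum(mask) / len(mask)
    return alpha * error_rate + (1.0 - alpha) * selected_fraction


if __name__ == "__main__":
    max_iter = 10
    chaos_values = logistic_map(x0=0.7, steps=max_iter)
    for t, c in enumerate(chaos_values, start=1):
        linear = walk_radius(1.0, range_ratio(t, max_iter))
        chaotic = walk_radius(1.0, range_ratio(t, max_iter, chaos=c))
        print(f"t={t:2d}  linear radius={linear:.4f}  chaotic radius={chaotic:.4f}")
    print("fitness:", scalar_fitness(error_rate=0.1, mask=[1, 0, 1, 0]))
```

Running the script prints the two radius schedules side by side: the linear one shrinks steadily, while the chaotic one occasionally widens again, which is the kind of behaviour a chaotic schedule is meant to provide against early stagnation.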
