Feature selection by using chaotic cuckoo optimization algorithm with levy flight, opposition-based learning and disruption operator

Feature selection plays an important role in high-dimensional data analysis and has drawn increasing attention in recent years. Finding the most relevant features for classification is one of the most important tasks in data mining and machine learning, since datasets commonly contain irrelevant features that reduce accuracy and slow down the classifier. Feature selection is an optimization process that aims to improve classification accuracy while reducing the number of selected features. Using too many features requires a large memory capacity and leads to slow execution. Feature selection algorithms are responsible for deciding which features a classification algorithm should use. Traditional algorithms are inefficient because of the high dimensionality of the problem, so evolutionary algorithms have been used to improve the search. The algorithm proposed in this paper, the chaotic cuckoo optimization algorithm with Lévy flight, disruption operator and opposition-based learning (CCOALFDO), is applied to select the optimal feature subspace for classification. It reduces the randomness in selecting features and avoids getting stuck in local optima, which leads to a more relevant feature subset. Extensive experiments are conducted on 20 high-dimensional datasets to demonstrate the effectiveness and efficiency of the proposed method. The results show that the proposed method outperforms state-of-the-art methods in terms of classification accuracy. In addition, they demonstrate the ability of CCOALFDO to select the most relevant features for classification tasks. Thus, it is a reasonable solution for handling noise and avoiding serious negative impacts on classification accuracy in real-world datasets.
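
To make the ingredients named above more concrete, the following Python sketch illustrates generic textbook forms of a chaotic map, opposition-based learning, a Lévy-distributed step, and a wrapper-style fitness that trades classification error against the number of selected features. It is not the authors' implementation: the function names, the logistic map, Mantegna's method for the Lévy step, the 0.5 binarization threshold, and the weight `alpha` are all assumptions chosen for illustration, and the error rate is passed in rather than computed by a real classifier.

```python
import math
import random


def logistic_map(x, r=4.0):
    """One iteration of the logistic chaotic map (assumed choice of map)."""
    return r * x * (1.0 - x)


def opposite(position, low=0.0, high=1.0):
    """Opposition-based learning: mirror each coordinate inside [low, high]."""
    return [low + high - p for p in position]


def levy_step(beta=1.5, rng=random):
    """Draw one heavy-tailed step with Mantegna's method (assumed variant)."""
    sigma = (
        math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
        / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
    ) ** (1 / beta)
    u = rng.gauss(0.0, sigma)
    v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def to_mask(position, threshold=0.5):
    """Binarize a continuous position into a feature-selection mask."""
    return [p > threshold for p in position]


def fitness(mask, error_rate, alpha=0.99):
    """Weighted sum of error rate and fraction of features kept (lower is better).

    error_rate is a placeholder for the result of training and evaluating
    a classifier on the features selected by mask.
    """
    selected_ratio = sum(mask) / len(mask)
    return alpha * error_rate + (1 - alpha) * selected_ratio


if __name__ == "__main__":
    random.seed(0)
    candidate = [random.random() for _ in range(8)]
    mirrored = opposite(candidate)
    # A hypothetical search loop would score both and keep the better one.
    print(to_mask(candidate), to_mask(mirrored))
    print(fitness(to_mask(candidate), error_rate=0.1))
```

In a full wrapper method, each candidate and its opposite would be scored with a real classifier, and chaotic or Lévy-distributed values would replace uniform random draws when generating new candidates.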
