A memetic algorithm using emperor penguin and social engineering optimization for medical data classification

Abstract Gene selection and classification of microarray data play an important role in cancer diagnosis and treatment. One of the most popular and fastest classification models is the support vector machine (SVM). However, the major challenge in using an SVM lies in selecting its two parameters, namely the regularization parameter C and the kernel parameter γ. Attempts have been made to improve the performance of SVM by tuning these two parameters with the help of metaheuristics. Although existing metaheuristics can find promising regions of the search space, they are unable to locate the global optimum efficiently. In this paper, a memetic algorithm-based SVM (M-SVM) is presented for simultaneous feature selection and optimization of the SVM parameters. The memetic algorithm fuses a local search strategy based on the social engineering optimizer (SEO) with a global optimization framework based on the emperor penguin optimizer (EPO). The idea of embedding SEO in EPO is to enhance the exploitation capability of EPO. The performance of the algorithm is evaluated on seven standard benchmark datasets. To demonstrate the efficacy of the method, it is compared with particle swarm optimization-based SVM (PSO-SVM), genetic algorithm-based SVM (GA-SVM), and fifteen other state-of-the-art methods. The experimental results confirm that the proposed method significantly outperforms the existing techniques in terms of accuracy and number of selected genes. The proposed method is validated statistically using analysis of variance (ANOVA).
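The memetic structure described in the abstract (a population-based global search with an embedded local search) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the global step is a generic "move toward the current best" rule rather than the EPO update equations, the local step is plain hill climbing rather than SEO, `toy_fitness` is a hypothetical stand-in for cross-validated SVM accuracy, and the encoding (log2 C, log2 γ, then one threshold value per gene) is one plausible way to search parameters and features together, not necessarily the one used in the paper.

```python
import random


def decode(vec, n_features):
    """Split a real-valued candidate into (C, gamma, feature mask).

    Assumed encoding: vec[0] = log2(C), vec[1] = log2(gamma),
    and vec[2:] holds one value per feature, selected when > 0.5.
    """
    C = 2.0 ** vec[0]
    gamma = 2.0 ** vec[1]
    mask = [v > 0.5 for v in vec[2:2 + n_features]]
    return C, gamma, mask


def toy_fitness(vec, n_features):
    """Hypothetical stand-in for cross-validated SVM accuracy (higher is better)."""
    _, _, mask = decode(vec, n_features)
    # Pretend the best parameters sit at log2(C) = 3 and log2(gamma) = -5.
    score = -((vec[0] - 3.0) ** 2 + (vec[1] + 5.0) ** 2) / 100.0
    score += 0.5 * (mask[0] + mask[1])   # pretend features 0 and 1 are informative
    score -= 0.05 * sum(mask)            # penalize large feature subsets
    return score


def clip(x, lo, hi):
    return max(lo, min(hi, x))


def local_search(vec, fitness, bounds, rng, steps=10, sigma=0.1):
    """Hill-climbing refinement (stand-in for the local search component)."""
    best, best_f = vec[:], fitness(vec)
    for _ in range(steps):
        cand = [clip(v + rng.gauss(0.0, sigma), lo, hi)
                for v, (lo, hi) in zip(best, bounds)]
        f = fitness(cand)
        if f > best_f:
            best, best_f = cand, f
    return best, best_f


def memetic_search(fitness, bounds, pop_size=20, iters=50, seed=0):
    """Global population search with local refinement of the best individual."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fits = [fitness(p) for p in pop]
    for _ in range(iters):
        b = max(range(pop_size), key=fits.__getitem__)
        leader = pop[b][:]
        # Global step: each individual drifts toward the leader with noise.
        for i in range(pop_size):
            cand = [clip(x + rng.random() * (g - x) + rng.gauss(0.0, 0.05 * (hi - lo)), lo, hi)
                    for x, g, (lo, hi) in zip(pop[i], leader, bounds)]
            f = fitness(cand)
            if f > fits[i]:
                pop[i], fits[i] = cand, f
        # Local step: refine the current best individual in place.
        b = max(range(pop_size), key=fits.__getitem__)
        pop[b], fits[b] = local_search(pop[b], fitness, bounds, rng)
    b = max(range(pop_size), key=fits.__getitem__)
    return pop[b], fits[b]


if __name__ == "__main__":
    n = 6  # hypothetical number of genes
    bounds = [(-5.0, 15.0), (-15.0, 3.0)] + [(0.0, 1.0)] * n
    best, best_f = memetic_search(lambda v: toy_fitness(v, n), bounds)
    print(decode(best, n), best_f)
```

In a real experiment, `toy_fitness` would be replaced by a function that trains an SVM with the decoded C and γ on the selected genes and returns a cross-validated accuracy, possibly combined with a penalty on the number of selected genes.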
