Selection of informative genes from high-dimensional cancerous data employing an improvised meta-heuristic algorithm

Identifying the most prominent genes that yield high classification accuracy in high-dimensional cancer data remains a challenge for machine learning researchers. Selecting informative genes is an NP-hard problem, so there is continued scope for applying new algorithms in this field. In this work, an improved version of a meta-heuristic algorithm, Chaotic Jaya (CJaya), is hybridized with the Kernel Extreme Learning Machine (KELM); the resulting model, CJaya-KELM, selects the most informative genes and classifies high-dimensional cancer data. Initially, the Fisher score technique is used to pre-select informative genes. Then, the CJaya algorithm is employed both to select key genes and to optimize the parameters of the KELM classifier. The model is evaluated on six cancer datasets and compared with particle swarm optimization hybridized KELM (PSO-KELM), genetic algorithm hybridized KELM (GA-KELM), and Jaya hybridized KELM (Jaya-KELM) models, as well as with ten other existing models. Performance is measured by accuracy under tenfold cross-validation, the number of selected genes, sensitivity, specificity, F-measure, and Matthews correlation coefficient (MCC). CJaya-KELM achieved the highest accuracy, sensitivity, and specificity on the Colon tumor (0.9677, 0.9714, 0.963), Leukemia (0.99, 0.9756, 1), Ovarian cancer (1, 1, 0.9892), Lymphoma-3 (0.9971, 1, 0.9583), ALL-AML-3 (0.9961, 0.9767, 1), and SRBCT (0.998, 1, 0.9583) datasets, respectively. The experimental results indicate that CJaya-KELM outperforms the compared models.
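The evaluation metrics named above all follow from the four counts of a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' evaluation code. The function name `binary_metrics` and the example counts are hypothetical: the source does not report confusion matrices, and its accuracy values come from tenfold cross-validation, so the counts below are illustrative only.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are true/false positives/negatives. Any ratio whose
    denominator is zero is reported as 0.0 (a convention assumed here).
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall, true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    pr_sum = precision + sensitivity
    f_measure = 2 * precision * sensitivity / pr_sum if pr_sum else 0.0
    # Matthews correlation coefficient: ranges from -1 to 1, 0 is chance level.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_measure": f_measure,
        "mcc": mcc,
    }


# Hypothetical counts for a 62-sample two-class problem (not from the paper).
metrics = binary_metrics(tp=34, fp=1, tn=26, fn=1)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Unlike accuracy, MCC uses all four counts symmetrically, which makes it a useful complement on class-imbalanced datasets such as many microarray benchmarks.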
