Gene Selection using Intelligent Dynamic Genetic Algorithm and Random Forest

Microarray gene expression data has provided a successful framework for investigating cancer and genetic diseases. Finding cancer-related genes with feature selection methods is central to microarray analysis. However, selecting a small number of informative genes is challenging because of the curse of dimensionality: microarray datasets contain far more genes than samples. This study introduces a new hybrid model based on the Intelligent Dynamic Genetic Algorithm (IDGA) and random forest to identify a small, meaningful set of genes for cancer classification. The random forest-based IDGA uses random forest in two places: first to filter out noisy and redundant genes, and then in the fitness function of the genetic algorithm. The proposed method was evaluated on two benchmark datasets, leukemia and colon cancer, and the top-ranked genes it selected are reported. Experimental results show that the proposed method achieves strong gene selection and classification performance compared with several recently proposed approaches.
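The two-stage idea described above (a random forest filter followed by a genetic search whose fitness is also computed with a random forest) can be illustrated with a minimal sketch. This is not the authors' IDGA: the abstract does not specify its intelligent or dynamic operators, so a plain genetic algorithm with truncation selection, one-point crossover, and bit-flip mutation stands in for it. The function names, the size penalty, all parameter values, and the synthetic data are assumptions made for illustration; scikit-learn is assumed to be available.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score


def rf_filter(X, y, keep, seed=0):
    """Stage 1 (assumed form): keep the `keep` genes with the highest RF importance."""
    forest = RandomForestClassifier(n_estimators=50, random_state=seed).fit(X, y)
    return np.argsort(forest.feature_importances_)[::-1][:keep]


def rf_fitness(mask, X, y, seed=0):
    """Cross-validated RF accuracy minus a small, hypothetical penalty on subset size."""
    if not mask.any():
        return 0.0
    forest = RandomForestClassifier(n_estimators=20, random_state=seed)
    accuracy = cross_val_score(forest, X[:, mask], y, cv=3).mean()
    return accuracy - 0.01 * mask.sum() / mask.size


def ga_select(X, y, pop_size=8, generations=5, p_mut=0.1, seed=0):
    """Stage 2: a plain GA over boolean gene masks (a stand-in for IDGA)."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    pop = rng.random((pop_size, n)) < 0.5
    for _ in range(generations):
        scores = np.array([rf_fitness(ind, X, y, seed) for ind in pop])
        # Truncation selection: the better half survives unchanged.
        parents = pop[np.argsort(scores)[::-1][: pop_size // 2]]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.choice(len(parents), size=2, replace=False)]
            cut = rng.integers(1, n)                    # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            child ^= rng.random(n) < p_mut              # bit-flip mutation
            children.append(child)
        pop = np.vstack([parents, np.array(children)])
    scores = np.array([rf_fitness(ind, X, y, seed) for ind in pop])
    return pop[np.argmax(scores)]


# Synthetic stand-in for a microarray matrix: 40 samples, 200 genes,
# with the first five genes shifted in class 1 so they carry signal.
data_rng = np.random.default_rng(0)
X = data_rng.normal(size=(40, 200))
y = np.repeat([0, 1], 20)
X[y == 1, :5] += 2.0

candidates = rf_filter(X, y, keep=20)       # filter stage
mask = ga_select(X[:, candidates], y)       # wrapper stage
selected_genes = candidates[mask]           # indices into the original matrix
```

Filtering first keeps the genetic search space small (20 candidate genes here instead of 200), which is the usual motivation for filter-plus-wrapper hybrids on data with many more genes than samples.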
