A Hybrid Filter-Wrapper Gene Selection Method for Cancer Classification

DNA microarray technology allows molecular biologists to measure the expression levels of thousands of genes (features) in a single experiment. Gene expression levels can support the diagnosis of diseases such as cancer, and computational techniques such as pattern classification can be applied for this purpose. However, the combination of a very large number of genes and very few patient samples prevents classifiers and other machine learning techniques from producing accurate results. Most of these genes are irrelevant or redundant, which can degrade classification performance, so gene selection is needed to retain only the most relevant genes. This paper proposes a hybrid filter-wrapper gene selection method that uses Minimum Redundancy Maximum Relevancy (MRMR) as the filter and the flower pollination algorithm (FPA) as the wrapper. MRMR first identifies the most important genes among all genes in the expression data, and FPA then searches the reduced set obtained by MRMR for the most informative gene subset. To test the accuracy and performance of the proposed method, extensive experiments are conducted on three microarray datasets: Colon, Breast, and Ovarian. For comparison, the same procedure is carried out with a genetic algorithm (GA) in place of FPA. The results indicate that MRMR-FPA can be used as an alternative method for addressing the gene selection problem.
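The two-stage idea described above (a filter that shrinks the gene set, followed by a wrapper that searches the reduced set) can be illustrated with a minimal sketch. This is not the authors' implementation, and several parts are assumptions made only for illustration: relevance and redundancy are approximated with absolute Pearson correlation rather than mutual information, the FPA positions are binarized with a sigmoid transfer function, the wrapper fitness is the leave-one-out accuracy of a nearest-centroid classifier, and the data are synthetic. All function names (`mrmr_filter`, `fpa_wrapper`, `mrmr_fpa`) and parameter values are hypothetical.

```python
from math import gamma, pi, sin
import numpy as np

def mrmr_filter(X, y, k):
    """Greedy mRMR-style filter (assumption: |Pearson r| replaces mutual information)."""
    n_genes = X.shape[1]
    gene_corr = np.nan_to_num(np.abs(np.corrcoef(X, rowvar=False)))
    relevance = np.nan_to_num(np.abs(
        np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(n_genes)])))
    selected = [int(np.argmax(relevance))]
    while len(selected) < k:
        redundancy = gene_corr[:, selected].mean(axis=1)
        score = relevance - redundancy          # relevance minus mean redundancy
        score[selected] = -np.inf               # never pick a gene twice
        selected.append(int(np.argmax(score)))
    return selected

def loo_accuracy(X, y):
    """Leave-one-out accuracy of a nearest-centroid classifier (assumed fitness)."""
    idx = np.arange(len(y))
    correct = 0
    for i in idx:
        Xtr, ytr = X[idx != i], y[idx != i]
        classes = np.unique(ytr)
        centroids = np.array([Xtr[ytr == c].mean(axis=0) for c in classes])
        pred = classes[np.argmin(((centroids - X[i]) ** 2).sum(axis=1))]
        correct += int(pred == y[i])
    return correct / len(y)

def levy(rng, size, beta=1.5):
    """Levy-distributed step sizes via Mantegna's algorithm."""
    sigma = (gamma(1 + beta) * sin(pi * beta / 2)
             / (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    return rng.normal(0, sigma, size) / np.abs(rng.normal(0, 1, size)) ** (1 / beta)

def fpa_wrapper(X, y, n_flowers=10, n_iter=20, p_switch=0.8, seed=0):
    """Binary FPA sketch: continuous positions, sigmoid transfer to gene masks."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    fitness = lambda bits: loo_accuracy(X[:, bits], y) if bits.any() else 0.0
    pos = rng.uniform(-1, 1, (n_flowers, d))
    bits = pos > 0
    fit = np.array([fitness(b) for b in bits])
    b = int(np.argmax(fit))
    best_pos, best_bits, best_fit = pos[b].copy(), bits[b].copy(), fit[b]
    for _ in range(n_iter):
        for i in range(n_flowers):
            if rng.random() < p_switch:         # global pollination toward the best
                cand = pos[i] + levy(rng, d) * (best_pos - pos[i])
            else:                               # local pollination between two flowers
                j, k = rng.choice(n_flowers, 2, replace=False)
                cand = pos[i] + rng.random() * (pos[j] - pos[k])
            cand = np.clip(cand, -6, 6)         # keep the sigmoid numerically stable
            cand_bits = rng.random(d) < 1 / (1 + np.exp(-cand))
            cand_fit = fitness(cand_bits)
            if cand_fit >= fit[i]:              # greedy replacement
                pos[i], bits[i], fit[i] = cand, cand_bits, cand_fit
                if cand_fit > best_fit:
                    best_pos, best_bits, best_fit = cand.copy(), cand_bits.copy(), cand_fit
    return best_bits, float(best_fit)

def mrmr_fpa(X, y, k=20, **fpa_kwargs):
    """Filter to k genes, then let the wrapper pick a subset of those k."""
    top = np.array(mrmr_filter(X, y, k))
    bits, acc = fpa_wrapper(X[:, top], y, **fpa_kwargs)
    return top[bits], acc

# Synthetic stand-in for a microarray: 40 samples, 200 genes, 5 informative.
rng = np.random.default_rng(42)
y = np.repeat([0, 1], 20)
X = rng.normal(size=(40, 200))
X[:, :5] += 2.0 * y[:, None]
genes, acc = mrmr_fpa(X, y, k=20)
print("selected genes:", sorted(genes.tolist()), "LOO accuracy:", acc)
```

Because the genes are selected and scored on the same samples, the reported accuracy in this sketch is optimistic; a fair evaluation would nest the selection inside an outer cross-validation loop.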
