Co-ABC: Correlation artificial bee colony algorithm for biomarker gene discovery using gene expression profile

In this paper, we propose Co-ABC, a new hybrid method that combines the Correlation-based Feature Selection (CFS) method with the Artificial Bee Colony (ABC) algorithm to select a small number of relevant genes for accurate classification of gene expression profiles. Co-ABC consists of three cooperating stages. The first stage filters noisy and redundant genes from the high-dimensional data by applying the CFS filter method. The second stage uses the ABC algorithm to select informative and meaningful genes. The third stage adopts a Support Vector Machine (SVM) classifier trained on the genes selected in the second stage. The overall performance of the proposed Co-ABC algorithm was evaluated on six binary and multi-class cancer gene expression datasets. In addition, to demonstrate its efficiency, we compare it with previously published related methods. Two of these methods were re-implemented with the same parameters for the sake of a fair comparison: Co-GA, which combines CFS with a genetic algorithm (GA), and Co-PSO, which combines CFS with a particle swarm optimization (PSO) algorithm. The experimental results show that the proposed Co-ABC algorithm achieves accurate classification performance using a small number of predictive genes. This indicates that Co-ABC is an efficient approach for biomarker gene discovery using cancer gene expression profiles.
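To make the first (filter) stage concrete, the following is a minimal sketch of a CFS-style filter, not the authors' implementation. It uses the standard CFS merit, k·r_cf / sqrt(k + k(k−1)·r_ff), where r_cf is the mean gene-class correlation and r_ff the mean gene-gene correlation of a k-gene subset. Two choices here are assumptions not stated in the abstract: Pearson correlation as the correlation measure and greedy forward search as the subset search. The function names and the toy data are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def cfs_merit(subset, genes, labels):
    """CFS merit of a gene subset: k*r_cf / sqrt(k + k*(k-1)*r_ff)."""
    k = len(subset)
    if k == 0:
        return 0.0
    # Mean absolute gene-class correlation (relevance).
    r_cf = sum(abs(pearson(genes[g], labels)) for g in subset) / k
    # Mean absolute gene-gene correlation (redundancy); 0 for a single gene.
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = 0.0
    if pairs:
        r_ff = sum(abs(pearson(genes[a], genes[b])) for a, b in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)

def cfs_forward_select(genes, labels):
    """Greedy forward search (an assumed search strategy): repeatedly add the
    gene that gives the highest merit, and stop when merit no longer improves."""
    selected, best = [], 0.0
    remaining = list(genes)
    while remaining:
        score, gene = max((cfs_merit(selected + [g], genes, labels), g) for g in remaining)
        if score <= best:
            break
        selected.append(gene)
        remaining.remove(gene)
        best = score
    return selected, best

# Hypothetical toy data: six samples, binary class labels, three genes.
labels = [0, 0, 0, 1, 1, 1]
genes = {
    "g1": [1.0, 1.2, 0.9, 3.1, 3.0, 2.8],  # strongly class-correlated
    "g2": [1.0, 3.0, 1.0, 3.0, 3.0, 1.0],  # weakly class-correlated, overlaps with g1
    "g3": [2.0, 1.0, 3.0, 2.0, 1.0, 3.0],  # uncorrelated with the class
}
selected, merit = cfs_forward_select(genes, labels)
print(selected, round(merit, 3))
```

On this toy data the search keeps only g1, since adding either of the other genes lowers the merit. In a pipeline like the one described, the retained genes would form the search space for the ABC stage, with SVM accuracy presumably serving as the fitness of each candidate subset.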
