Gene Selection and Classification in Microarray Datasets using a Hybrid Approach of PCC-BPSO/GA with Multi Classifiers

In this study, a three-phase hybrid approach is proposed for gene selection and classification of high-dimensional microarray data. The method combines a Pearson's Correlation Coefficient (PCC) filter with either Binary Particle Swarm Optimization (BPSO) or a Genetic Algorithm (GA), followed by one of several classifiers, forming a PCC-BPSO/GA-multi-classifier approach. Five different classifiers are employed in the final classification stage. Combining the PCC filter with BPSO or GA markedly improved classification accuracy, although the size of this improvement varied across datasets and depended on the final classifier applied. The combinations of the hybrid technique were compared in terms of classification accuracy and number of selected genes. BPSO not only ran faster than GA but also performed better than GA when combined with PCC feature selection.
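The three phases described above (PCC filter, BPSO search, final classifier) can be illustrated with a minimal sketch. This is not the authors' implementation: the synthetic data, the function names `pcc_filter`, `loo_accuracy` and `bpso_select`, the swarm parameters, and the nearest-centroid classifier used as the fitness function are all assumptions made for illustration. The abstract does not name the five classifiers or the parameter settings, and the GA variant is omitted here.

```python
import numpy as np

def pcc_filter(X, y, k):
    """Phase 1: keep the k genes with the largest |Pearson r| against the class label."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    # Constant genes have zero variance; give them r = 0 instead of dividing by zero.
    r = (Xc * yc[:, None]).sum(axis=0) / np.where(denom == 0, 1.0, denom)
    return np.argsort(-np.abs(r))[:k]

def loo_accuracy(X, y):
    """Phase 3 stand-in: leave-one-out accuracy of a nearest-centroid classifier."""
    n = len(y)
    hits = 0
    for i in range(n):
        train = np.arange(n) != i
        classes = np.unique(y[train])
        centroids = np.array([X[train][y[train] == c].mean(axis=0) for c in classes])
        pred = classes[np.argmin(((centroids - X[i]) ** 2).sum(axis=1))]
        hits += int(pred == y[i])
    return hits / n

def bpso_select(X, y, n_particles=10, n_iter=20, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Phase 2: binary PSO with a sigmoid transfer function over gene subsets."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    pos = rng.integers(0, 2, size=(n_particles, d))   # 1 = gene selected
    vel = rng.uniform(-1.0, 1.0, size=(n_particles, d))

    def fitness(bits):
        if bits.sum() == 0:                            # empty subset is invalid
            return 0.0
        return loo_accuracy(X[:, bits.astype(bool)], y)

    pbest = pos.copy()
    pbest_fit = np.array([fitness(p) for p in pos])
    best = int(np.argmax(pbest_fit))
    gbest, gbest_fit = pbest[best].copy(), pbest_fit[best]

    for _ in range(n_iter):
        r1 = rng.random((n_particles, d))
        r2 = rng.random((n_particles, d))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        vel = np.clip(vel, -4.0, 4.0)                  # keep sigmoid away from 0 and 1
        prob = 1.0 / (1.0 + np.exp(-vel))
        pos = (rng.random((n_particles, d)) < prob).astype(int)
        fit = np.array([fitness(p) for p in pos])
        improved = fit > pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]
        best = int(np.argmax(pbest_fit))
        if pbest_fit[best] > gbest_fit:
            gbest, gbest_fit = pbest[best].copy(), pbest_fit[best]
    return gbest.astype(bool), float(gbest_fit)

# Synthetic stand-in for a microarray dataset: 30 samples, 200 genes,
# of which the first 5 are shifted in class 1 (assumed, for illustration only).
rng = np.random.default_rng(1)
y = np.repeat([0, 1], 15)
X = rng.normal(size=(30, 200))
X[:, :5] += 2.0 * y[:, None]

top = pcc_filter(X, y, k=20)             # phase 1: filter
mask, acc = bpso_select(X[:, top], y)    # phases 2 and 3: wrapper search + classifier
selected_genes = top[mask]
print(len(selected_genes), "genes selected, LOO accuracy", round(acc, 3))
```

Note that the sketch filters on all samples before cross-validating, which leaks label information and gives optimistic accuracy. A careful evaluation would repeat the filtering inside each training fold.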
