A novel filter-wrapper hybrid gene selection approach for microarray data based on multi-objective forest optimization algorithm

Article history: Received April 5, 2020. Received in revised format May 9, 2020. Accepted May 29, 2020. Available online May 29, 2020.

Gene selection is one of the most important approaches to dimensionality reduction in data preprocessing and to improving classification performance on microarray data, which usually contain several thousand genes but very few samples. These characteristics increase the complexity of classification models and reduce their efficiency. The gene selection problem inherently pursues two goals: reducing the number of genes and increasing classification efficiency. This paper therefore presents a novel hybrid filter-wrapper solution based on the Fisher-score method and the Multi-Objective Forest Optimization Algorithm (MOFOA). In the proposed method, the Fisher-score method serves as a preprocessing step that removes redundant or irrelevant genes and retains 500 discriminative genes. MOFOA then searches for an optimal gene subset using concepts such as a repository, crowding distance, and binary tournament selection. While solving the gene selection problem, the proposed method simultaneously optimizes the kernel parameters of the SVM classification model. Six microarray datasets were used to evaluate the proposed method, and its results were compared with those of four multi-objective hybrid methods from the literature in terms of classification performance, number of selected genes, running time, and hypervolume. According to the results, the proposed solution selects fewer genes, achieves higher classification accuracy in most cases, and performs similarly to or better than the other multi-objective gene selection approaches.

© 2020 by the authors; licensee Growing Science, Canada.
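The Fisher-score filter step can be illustrated with a minimal sketch. It assumes the commonly used formulation, in which a gene's score is the ratio of between-class scatter to within-class scatter, F_j = sum_c n_c (mu_cj - mu_j)^2 / sum_c n_c sigma_cj^2; the abstract does not state which variant the authors use. The function names `fisher_scores` and `top_k_genes` are hypothetical, and only the default of 500 retained genes comes from the text.

```python
from collections import defaultdict


def fisher_scores(X, y):
    """Return one Fisher score per column of X (rows = samples, columns = genes).

    Assumed formulation: between-class scatter divided by within-class scatter.
    """
    n_features = len(X[0])

    # Group the sample rows by class label.
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)

    scores = []
    for j in range(n_features):
        column = [row[j] for row in X]
        overall_mean = sum(column) / len(column)

        between = 0.0  # sum over classes of n_c * (class mean - overall mean)^2
        within = 0.0   # sum over classes of n_c * class variance
        for rows in by_class.values():
            values = [row[j] for row in rows]
            n_c = len(values)
            mean_c = sum(values) / n_c
            var_c = sum((v - mean_c) ** 2 for v in values) / n_c
            between += n_c * (mean_c - overall_mean) ** 2
            within += n_c * var_c

        if within > 0:
            scores.append(between / within)
        else:
            # No within-class spread: perfectly separating if the class means
            # differ, otherwise the gene is constant and carries no information.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


def top_k_genes(X, y, k=500):
    """Return the indices of the k highest-scoring genes, best first."""
    scores = fisher_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


# Toy data: 4 samples, 3 genes, 2 classes. Only gene 0 separates the classes.
X = [[1, 5, 0], [2, 5, 10], [9, 5, 0], [10, 5, 10]]
y = [0, 0, 1, 1]
print(fisher_scores(X, y))
print(top_k_genes(X, y, k=1))
```

In a pipeline like the one described, the indices returned by `top_k_genes` would define the reduced search space handed to the wrapper stage; the MOFOA search itself is not sketched here because the abstract does not give its details.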
