Hybrid feature selection algorithm using symmetrical uncertainty and a harmony search algorithm

Microarray technology can be used as an efficient diagnostic system to recognise diseases such as tumours or to discriminate between different types of cancers in normal tissues. It has received increasing attention from the bioinformatics community because of its potential for building powerful decision-making tools for cancer diagnosis. However, microarray data contain thousands or tens of thousands of genes, and this high dimensionality affects the predictive accuracy of classification. A key issue is therefore to select the smallest possible set of genes from the input data that still achieves good predictive accuracy. In this work, we propose a two-stage algorithm for gene selection in microarray data-sets, called the symmetrical uncertainty filter and harmony search algorithm wrapper (SU-HSA). Experimental results show that SU-HSA achieves higher accuracy than HSA in isolation on all data-sets and selects fewer genes on 6 out of 10 instances. Furthermore, a comparison with state-of-the-art methods shows that the proposed approach obtains 5 (out of 10) new best results in terms of the number of selected genes and competitive results in terms of classification accuracy.
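To make the two-stage idea concrete, the following Python sketch pairs a symmetrical uncertainty (SU) ranking filter with a basic binary harmony search. It is an illustration under stated assumptions, not the implementation evaluated in this work: SU is computed with its standard definition, SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)), on already-discretised features; the function names, the parameter values (`hms`, `hmcr`, `par`, `iters`), the use of a bit flip as the pitch adjustment, and the caller-supplied `fitness` function are all hypothetical choices made for the example.

```python
import math
import random
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)); 0 means independent, 1 fully dependent."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0
    mutual_info = hx + hy - entropy(list(zip(x, y)))
    return 2.0 * mutual_info / (hx + hy)


def su_filter(columns, labels, keep):
    """Stage 1 (filter): indices of the `keep` features with the highest SU to the labels."""
    ranked = sorted(range(len(columns)),
                    key=lambda j: symmetrical_uncertainty(columns[j], labels),
                    reverse=True)
    return ranked[:keep]


def harmony_search(n_bits, fitness, hms=10, hmcr=0.9, par=0.3, iters=200, seed=0):
    """Stage 2 (wrapper): basic binary harmony search maximising `fitness(mask)`.

    `fitness` is a placeholder supplied by the caller, for example a
    cross-validated classifier accuracy on the genes switched on in `mask`.
    """
    rng = random.Random(seed)
    memory = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(hms)]
    scores = [fitness(h) for h in memory]
    for _ in range(iters):
        new = []
        for j in range(n_bits):
            if rng.random() < hmcr:
                bit = rng.choice(memory)[j]   # memory consideration
                if rng.random() < par:
                    bit = 1 - bit             # pitch adjustment (assumed: bit flip)
            else:
                bit = rng.randint(0, 1)       # random consideration
            new.append(bit)
        score = fitness(new)
        worst = min(range(hms), key=scores.__getitem__)
        if score > scores[worst]:             # replace the worst harmony if improved
            memory[worst], scores[worst] = new, score
    best = max(range(hms), key=scores.__getitem__)
    return memory[best], scores[best]
```

In a pipeline of this shape, `su_filter` would first reduce the gene pool to a manageable size, and `harmony_search` would then search over subsets of the retained genes only, which keeps the wrapper's search space small.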
