Clustering in Conjunction with Quantum Genetic Algorithm for Relevant Genes Selection for Cancer Microarray Data

The Quantum Genetic Algorithm (QGA), which combines principles of quantum computing with genetic operators, allows efficient simultaneous exploration and exploitation of a large search space. It has recently been used to determine a reduced set of features for cancer microarray data in order to improve the performance of the learning system. However, the chromosome length used there equals the original dimension of the feature vector. Hence, despite using the quantum variant of GA, the method requires a large amount of memory and computation time for high-dimensional data such as microarrays. In this paper, we propose a two-phase approach, ClusterQGA, that determines a minimal set of relevant and non-redundant genes. Experimental results on publicly available cancer microarray datasets demonstrate the effectiveness of the proposed approach in comparison to existing methods in terms of classification accuracy and number of features. The proposed approach also takes less computation time than the Genetic quantum algorithm proposed by Abderrahim et al.
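To illustrate why chromosome length matters, the following is a minimal sketch of a Q-bit chromosome as commonly used in quantum-inspired genetic algorithms; it is not the implementation used in this paper. It assumes the standard representation in which each Q-bit is a pair of amplitudes (alpha, beta) with alpha² + beta² = 1, and an observation sets a bit to 1 with probability beta². The function names and the figure of 2000 genes are hypothetical, chosen only for illustration.

```python
import math
import random


def init_qbit_chromosome(n_genes):
    """Create one Q-bit chromosome with every Q-bit in equal superposition.

    One (alpha, beta) pair is stored per gene, so storage grows linearly
    with the number of genes in the feature vector.
    """
    amp = 1.0 / math.sqrt(2.0)
    return [(amp, amp) for _ in range(n_genes)]


def observe(chromosome, rng):
    """Collapse each Q-bit to a classical bit (1 with probability beta**2).

    The result is a binary mask: 1 means the gene is selected.
    """
    return [1 if rng.random() < beta * beta else 0 for _, beta in chromosome]


# Hypothetical example: a dataset with 2000 genes (an assumption).
rng = random.Random(0)
chromosome = init_qbit_chromosome(2000)
mask = observe(chromosome, rng)
selected = [i for i, bit in enumerate(mask) if bit == 1]
```

With one Q-bit per gene, every individual in the population carries two amplitudes for each original feature. Shortening the chromosome before the QGA search therefore reduces both memory use and the cost of each observation and fitness evaluation.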
