An Efficient Approach for Classification of Gene Expression Microarray Data

Microarrays record gene expression data from cells, and each microarray describes the features of a cell. In a microarray dataset, the rows represent the samples and the columns represent the gene expression levels. Because microarray data is high-dimensional, classification with conventional methods becomes tedious and inefficient, so reducing the dimension of the long feature vector and extracting the relevant features from it is a challenging task. This can be achieved with feature extraction and/or feature selection techniques. Designing an efficient classification model is another crucial task in any classification problem. This paper emphasizes both significant feature extraction and efficient classifier design. Microarray classification is carried out in two phases. In the first phase, a hybrid of Genetic Algorithm (GA) and Principal Component Analysis (PCA) is used to extract relevant features. In the second phase, a Probabilistic Neural Network (PNN) is used as the classifier, and a GA is used to optimize the topology of the PNN. The datasets used in the experiments are Colon Tumor, Diffuse Large B-Cell Lymphoma (DLBCL), and Leukaemia (ALL and AML). The proposed technique gave efficient results on these datasets.
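The two phases can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes NumPy, uses synthetic data in place of the Colon Tumor, DLBCL, and Leukaemia datasets, and omits both GA steps (selecting components and tuning the PNN). The function names `pca_fit`, `pca_transform`, and `pnn_predict`, the number of components, and the smoothing parameter `sigma` are illustrative assumptions.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the training mean and top principal axes of X (samples x genes)."""
    mean = X.mean(axis=0)
    # SVD of the centred data; the rows of Vt are the principal axes.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def pca_transform(X, mean, components):
    """Project samples onto the retained principal axes."""
    return (X - mean) @ components.T

def pnn_predict(X_train, y_train, X_test, sigma=1.0):
    """PNN decision rule: average a Gaussian kernel over each class's
    training samples and pick the class with the largest average."""
    classes = np.unique(y_train)
    # Squared Euclidean distances, shape (n_test, n_train).
    d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    kernel = np.exp(-d2 / (2.0 * sigma ** 2))
    scores = np.stack(
        [kernel[:, y_train == c].mean(axis=1) for c in classes], axis=1
    )
    return classes[scores.argmax(axis=1)]

# Synthetic stand-in data (assumption): 60 samples, 100 genes, two classes,
# with class 1 shifted in the first 20 genes.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 100))
y = np.repeat([0, 1], 30)
X[y == 1, :20] += 3.0

# Random train/test split: 40 training samples, 20 test samples.
idx = rng.permutation(60)
train, test = idx[:40], idx[40:]
X_train, y_train = X[train], y[train]
X_test, y_test = X[test], y[test]

# Phase 1 (without the GA): reduce the dimension with PCA.
mean, comps = pca_fit(X_train, n_components=2)
Z_train = pca_transform(X_train, mean, comps)
Z_test = pca_transform(X_test, mean, comps)

# Phase 2 (without the GA): classify with the PNN.
pred = pnn_predict(Z_train, y_train, Z_test, sigma=2.0)
accuracy = (pred == y_test).mean()
print("test accuracy:", accuracy)
```

In the approach described above, a GA would take over the choices that are fixed by hand here, such as which extracted features to keep and how the PNN is configured.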
