Microarray gene expression cancer diagnosis using Machine Learning algorithms

In this paper, we use the extreme Learning Machine (ELM) for cancer classification. We propose a two step method. In our two step feature selection method, we first use a gene importance ranking and then, finding the minimum gene subset form the top-ranked genes based on the first step. We tested our two step method in cancer datasets like Lymphoma data set and SRBCT data set. The results in the Lymphoma data set and SRBCT dataset show our two-step methods is able to achieve 100% accuracy with much fewer gene combination than other published results. The results indicate that ELM produces comparable or better classification accuracies with reduced training time and implementation complexity compared to neural networks methods like Back Propagation Networks, SANN and Support Vector Machine methods. ELM also achieves better accuracy for classification of individual categories.

[1]  Feng Chu,et al.  A General Wrapper Approach to Selection of Class-Dependent Features , 2008, IEEE Transactions on Neural Networks.

[2]  Danh V. Nguyen,et al.  Tumor classification by partial least squares using microarray gene expression data , 2002, Bioinform..

[3]  Thomas A. Darden,et al.  Gene selection for sample classification based on gene expression data: study of sensitivity to choice of parameters of the GA/KNN method , 2001, Bioinform..

[4]  Chee Kheong Siew,et al.  Extreme learning machine: RBF network case , 2004, ICARCV 2004 8th Control, Automation, Robotics and Vision Conference, 2004..

[5]  Feng Chu,et al.  Accurate Cancer Classification Using Expressions of Very Few Genes , 2007, IEEE/ACM Transactions on Computational Biology and Bioinformatics.

[6]  Jae Won Lee,et al.  An extensive comparison of recent classification tools applied to microarray data , 2004, Comput. Stat. Data Anal..

[7]  P. Saratchandran,et al.  Multicategory Classification Using An Extreme Learning Machine for Microarray Gene Expression Cancer Diagnosis , 2007, IEEE/ACM Transactions on Computational Biology and Bioinformatics.

[8]  I. Jolliffe Principal Component Analysis , 2002 .

[9]  Guang-Bin Huang,et al.  Extreme learning machine: a new learning scheme of feedforward neural networks , 2004, 2004 IEEE International Joint Conference on Neural Networks (IEEE Cat. No.04CH37541).

[10]  Shmuel Friedland,et al.  A simultaneous reconstruction of missing data in DNA microarrays , 2006 .

[11]  Heng Tao Shen,et al.  Principal Component Analysis , 2009, Encyclopedia of Biometrics.

[12]  Wei Xie,et al.  Accurate Cancer Classification Using Expressions of Very Few Genes , 2007, IEEE/ACM Transactions on Computational Biology and Bioinformatics.

[13]  Nello Cristianini,et al.  Support vector machine classification and validation of cancer tissue samples using microarray expression data , 2000, Bioinform..