Multicategory Classification Using An Extreme Learning Machine for Microarray Gene Expression Cancer Diagnosis

In this paper, the recently developed Extreme Learning Machine (ELM) is applied directly to multicategory classification problems in cancer diagnosis. ELM avoids problems such as local minima, improper learning rates, and overfitting that iterative learning methods commonly face, and it completes training very quickly. We evaluate the multicategory classification performance of ELM on three benchmark microarray data sets for cancer diagnosis: the GCM, Lung, and Lymphoma data sets. The results indicate that ELM achieves comparable or better classification accuracy, with reduced training time and implementation complexity, compared with artificial neural network methods such as conventional back-propagation ANN and Linder's SANN, and with Support Vector Machine methods such as SVM-OVO and Ramaswamy's SVM-OVA. ELM also achieves better accuracy in classifying individual categories.
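To make the non-iterative training idea concrete, the following is a minimal sketch of a single-hidden-layer ELM for multicategory classification. It is not the configuration used in the experiments: the sigmoid activation, the ±1 target coding, the number of hidden nodes, the helper names `train_elm` and `predict_elm`, and the synthetic three-class data are all assumptions made for illustration.

```python
import numpy as np

def train_elm(X, y, n_hidden=30, n_classes=None, seed=0):
    """Fit a basic ELM: random hidden layer, least-squares output weights."""
    rng = np.random.default_rng(seed)
    n_classes = int(y.max()) + 1 if n_classes is None else n_classes
    # Hidden-layer weights and biases are drawn at random and never tuned.
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))  # sigmoid hidden outputs
    # One output node per class, with +1 / -1 target coding (an assumption).
    T = -np.ones((X.shape[0], n_classes))
    T[np.arange(X.shape[0]), y] = 1.0
    # Output weights come from a single pseudoinverse solve, no iterations.
    beta = np.linalg.pinv(H) @ T
    return W, b, beta

def predict_elm(model, X):
    """Assign each sample to the class whose output node is largest."""
    W, b, beta = model
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return np.argmax(H @ beta, axis=1)

# Synthetic stand-in for expression data: three well-separated classes.
rng = np.random.default_rng(42)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [-5.0, 5.0]])
y = np.repeat(np.arange(3), 30)
X = centers[y] + 0.5 * rng.standard_normal((90, 2))

model = train_elm(X, y, n_hidden=30)
acc = float(np.mean(predict_elm(model, X) == y))
print(f"training accuracy: {acc:.2f}")
```

Because the only fitted quantity is `beta`, there is no learning rate to choose and no gradient descent that could stall in a local minimum, which is the property the abstract attributes to ELM. A single network with one output node per class handles all categories at once, in contrast to combining binary classifiers as in SVM-OVO or SVM-OVA.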
