A Novel Two-Stage Cancer Classification Method for Microarray Data Based on Supervised Manifold Learning

Gene expression data analysis is a useful tool for medical diagnosis. Combined with classification methods, it can help inform clinical decisions for individual patients. This paper proposes a novel classification method for cancer microarray data. The method has two stages. In the first stage, a gene selection algorithm selects a subset of genes, and supervised locality preserving projections (SLPP) is then applied for further dimension reduction and discriminant feature extraction; this stage finds more discriminative projection directions from the training data. In the second stage, nearest neighbor (NN) and support vector machine (SVM) classifiers are used for classification. To validate the proposed method, four real cancer data sets were classified, and prediction performance was evaluated by 3-fold cross-validation. The experimental results show that the proposed method is effective and efficient.
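The two-stage procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the gene selection algorithm, so a simple between/within-class variance ratio is used here; SLPP is written in one common form in which samples of the same class are treated as neighbors with weight 1; only the NN classifier is shown (SVM is omitted); and the data are synthetic. All function names and parameter values are hypothetical.

```python
import numpy as np

def select_genes(X, y, k):
    """Assumed filter: rank genes by between/within-class variance ratio, keep top k."""
    overall = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        between += len(Xc) * (Xc.mean(axis=0) - overall) ** 2
        within += ((Xc - Xc.mean(axis=0)) ** 2).sum(axis=0)
    score = between / (within + 1e-12)
    return np.argsort(score)[::-1][:k]

def fit_slpp(X, y, n_components, reg=1e-3):
    """Supervised LPP sketch: weight 1 for same-class pairs, 0 otherwise.

    Solves the generalized eigenproblem (X^T L X) a = lambda (X^T D X) a and
    returns the directions with the smallest eigenvalues. X has samples in rows.
    """
    W = (y[:, None] == y[None, :]).astype(float)
    D = np.diag(W.sum(axis=1))
    L = D - W                                  # graph Laplacian
    A = X.T @ L @ X
    B = X.T @ D @ X
    d = X.shape[1]
    B = B + reg * np.trace(B) / d * np.eye(d)  # ridge term keeps B positive definite
    C_inv = np.linalg.inv(np.linalg.cholesky(B))
    M = C_inv @ A @ C_inv.T                    # reduce to a standard symmetric problem
    _, vecs = np.linalg.eigh((M + M.T) / 2)    # eigenvalues in ascending order
    return C_inv.T @ vecs[:, :n_components]

def predict_1nn(Z_train, y_train, Z_test):
    """Nearest-neighbor classification with Euclidean distance."""
    d2 = ((Z_test[:, None, :] - Z_train[None, :, :]) ** 2).sum(axis=2)
    return y_train[d2.argmin(axis=1)]

def cross_validate(X, y, n_genes=20, n_components=1, folds=3, seed=0):
    """k-fold CV; gene selection and SLPP are fitted on the training folds only."""
    idx = np.random.default_rng(seed).permutation(len(y))
    accs = []
    for f in range(folds):
        test = idx[f::folds]
        train = np.setdiff1d(idx, test)
        genes = select_genes(X[train], y[train], n_genes)
        mu = X[train][:, genes].mean(axis=0)
        X_tr = X[train][:, genes] - mu
        X_te = X[test][:, genes] - mu
        P = fit_slpp(X_tr, y[train], n_components)
        pred = predict_1nn(X_tr @ P, y[train], X_te @ P)
        accs.append((pred == y[test]).mean())
    return float(np.mean(accs))

def make_toy_data(n_per_class=30, n_genes=200, n_informative=10, shift=2.0, seed=0):
    """Synthetic stand-in for microarray data: a few genes differ between classes."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(2 * n_per_class, n_genes))
    y = np.repeat([0, 1], n_per_class)
    X[y == 1, :n_informative] += shift
    return X, y

X, y = make_toy_data()
acc = cross_validate(X, y)
print(f"3-fold CV accuracy on synthetic data: {acc:.3f}")
```

Selecting genes and fitting the projection inside each fold, rather than on the full data set, keeps the held-out samples from influencing the features, which matters when genes far outnumber samples.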
