Accurate Cancer Classification Using Expressions of Very Few Genes

Gene expression profiling with microarrays has been used effectively for the classification and diagnostic prediction of cancers. Several machine learning and data mining techniques are currently applied to identify cancer from gene expression data. However, these techniques were not designed for the particular requirements of microarray analysis. First, microarray data has a high-dimensional feature space, in which the number of genes often exceeds the number of samples by a factor of 100 or more. Second, microarray data contains a high degree of noise. Most existing techniques do not adequately address these two problems, dimensionality and noise. Gene ranking methods were introduced to overcome them; widely used examples include the T-score and ANOVA. However, these methods can sometimes rank genes incorrectly when the data set is large. To address this issue, this paper proposes a technique called Enrichment Score for gene ranking. The classifier used in the proposed approach is the Support Vector Machine (SVM). Experiments on a lymphoma data set show that the proposed approach achieves better classification accuracy than the conventional method.
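To make the gene ranking step concrete, the following is a minimal sketch of the conventional T-score ranking mentioned above, not the proposed Enrichment Score, which the abstract does not define. It assumes a two-class problem and uses the Welch form of the t-statistic. The function names `t_score` and `rank_genes`, the gene names, and the toy expression values are all hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import mean, variance


def t_score(a, b):
    """Absolute Welch t-statistic between two groups of expression values."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return abs(mean(a) - mean(b)) / se if se > 0 else 0.0


def rank_genes(expr, labels):
    """Return gene names sorted from most to least discriminative.

    expr   -- dict mapping gene name to a list of expression values,
              one value per sample (hypothetical layout)
    labels -- list of class labels (0 or 1), one per sample
    """
    scores = {}
    for gene, values in expr.items():
        class0 = [v for v, y in zip(values, labels) if y == 0]
        class1 = [v for v, y in zip(values, labels) if y == 1]
        scores[gene] = t_score(class0, class1)
    return sorted(scores, key=scores.get, reverse=True)


# Toy data (made up): six samples, three per class.
labels = [0, 0, 0, 1, 1, 1]
expr = {
    "gene_a": [1.0, 1.2, 0.9, 5.1, 4.8, 5.3],  # large shift between classes
    "gene_b": [2.0, 3.5, 1.0, 2.2, 3.1, 1.4],  # mostly noise
    "gene_c": [0.5, 0.7, 0.6, 1.5, 1.2, 1.9],  # moderate shift
}

ranking = rank_genes(expr, labels)
top_genes = ranking[:2]  # keep only the highest-ranked genes
print(ranking)
```

In a pipeline like the one described, only the expression values of the top-ranked genes would then be passed to the classifier (an SVM in this paper), which reduces the dimensionality the classifier has to handle.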
