A Novel Relative Space Based Gene Feature Extraction and Cancer Recognition

Recognizing patient samples from their gene expression profiles is useful for cancer diagnosis and therapy. Gene expression data are high-dimensional, highly redundant, and noisy; within such data, the locality of carcinogenic factors is studied. Using a gene feature transformation, a relative space with respect to a cancer is built, and from it a least spread space, the subspace with the least energy with respect to that cancer, is extracted. It is proven that the cancer can be recognized in the least spread space, and a classification method, cancer classification with least spread space (CCLSS), is proposed. On the Leukemia and Colon datasets, the correlation between the recognition rate and the rank of the least spread space is explored, and the optimal least spread spaces for AML/ALL and for tumor colon tissue (TCT)/normal colon tissue (NCT) are extracted. Finally, experiments with different classification algorithms are conducted using leave-one-out cross-validation (LOOCV), and the results show that CCLSS achieves higher precision than traditional classification algorithms.
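The abstract does not spell out how the relative space or the least spread space is constructed, so the following is only a rough sketch of the evaluation protocol, not the CCLSS algorithm itself. It assumes a deliberately simplified stand-in: each class is centered on its own mean (a crude "relative" representation), the `k` features with the lowest within-class variance are kept (an axis-aligned approximation of a low-energy subspace), and a sample is assigned to the class in which it has the least energy. The function names, the parameter `k`, the class labels, and the toy values are all hypothetical and are not taken from the Leukemia or Colon datasets.

```python
from statistics import mean, pvariance


def fit_class(samples, k):
    """Return (class mean, indices of the k lowest-variance features)."""
    columns = list(zip(*samples))
    center = [mean(col) for col in columns]
    spread = [pvariance(col) for col in columns]
    keep = sorted(range(len(columns)), key=lambda j: spread[j])[:k]
    return center, keep


def energy(x, model):
    """Squared deviation of x from the class mean on the kept features."""
    center, keep = model
    return sum((x[j] - center[j]) ** 2 for j in keep)


def predict(x, models):
    """Assign x to the class whose kept features give the least energy."""
    return min(models, key=lambda label: energy(x, models[label]))


def loocv_accuracy(X, y, k):
    """Leave-one-out cross-validation: refit on all samples but one."""
    hits = 0
    for i in range(len(X)):
        train = [(x, lab) for j, (x, lab) in enumerate(zip(X, y)) if j != i]
        models = {}
        for label in {lab for _, lab in train}:
            rows = [x for x, lab in train if lab == label]
            models[label] = fit_class(rows, k)
        hits += predict(X[i], models) == y[i]
    return hits / len(X)


# Made-up toy values: feature 0 is stable within each class,
# features 1 and 2 are noisy.
X = [
    [1.0, 0.2, 3.0], [1.1, 0.9, 7.0], [0.9, 0.5, 1.0], [1.0, 0.1, 5.0],
    [5.0, 0.3, 2.0], [5.1, 0.8, 6.0], [4.9, 0.4, 8.0], [5.0, 0.6, 4.0],
]
y = ["class_a"] * 4 + ["class_b"] * 4

print(loocv_accuracy(X, y, k=1))
```

Because the model is refitted inside every fold, the held-out sample never influences the feature selection. Repeating the call over a range of `k` values gives a simple analogue of exploring how the recognition rate depends on the rank of the retained space.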
