A comparative study of feature selection methods for probabilistic neural networks in cancer classification

Accurate diagnosis and classification are key to the optimal treatment of cancer patients. Several studies have demonstrated that cancer classes can be predicted with high accuracy, sensitivity and specificity from microarray-based gene expression profiles using artificial neural networks. This paper presents a comprehensive study of the capability of probabilistic neural networks (PNN), combined with a feature selection method known as the signal-to-noise statistic, for cancer classification. The signal-to-noise statistic, which measures how strongly each gene correlates with the class distinction, is used to select marker genes and reduce the dimensionality of the data samples presented to the PNN. The experimental results show that combining the PNN with the signal-to-noise statistic achieves superior classification results for two types of acute leukemia and five categories of embryonal tumors of the central nervous system, with satisfactory computation speed. Furthermore, the signal-to-noise analysis yields candidate genes for future study, both for understanding the disease process and for identifying potential targets for therapeutic intervention.
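
As a rough illustration of the pipeline described above, the following minimal sketch ranks genes by a signal-to-noise score and classifies a sample with a Parzen-window style PNN. It is not the authors' implementation. It assumes the commonly used two-class form of the statistic, (mu_1 - mu_0) / (sigma_1 + sigma_0), population standard deviations, a Gaussian kernel with a hand-picked smoothing parameter `sigma`, and a small made-up expression matrix. The function names `signal_to_noise`, `select_genes` and `pnn_predict` are hypothetical, and the multi-class case (such as the five tumor categories) is not covered.

```python
import math
from statistics import mean, pstdev


def signal_to_noise(expr, labels):
    """Per-gene score (mu_1 - mu_0) / (sigma_1 + sigma_0) for binary labels 0/1."""
    scores = []
    for g in range(len(expr[0])):
        pos = [row[g] for row, y in zip(expr, labels) if y == 1]
        neg = [row[g] for row, y in zip(expr, labels) if y == 0]
        denom = pstdev(pos) + pstdev(neg)
        # Guard against genes that are constant within both classes.
        scores.append((mean(pos) - mean(neg)) / denom if denom > 0 else 0.0)
    return scores


def select_genes(scores, k):
    """Indices of the k genes with the largest absolute score."""
    order = sorted(range(len(scores)), key=lambda g: abs(scores[g]), reverse=True)
    return order[:k]


def pnn_predict(train_x, train_y, x, sigma=1.0):
    """Pick the class whose averaged Gaussian kernel response to x is largest."""
    sums, counts = {}, {}
    for row, y in zip(train_x, train_y):
        d2 = sum((a - b) ** 2 for a, b in zip(row, x))
        sums[y] = sums.get(y, 0.0) + math.exp(-d2 / (2.0 * sigma ** 2))
        counts[y] = counts.get(y, 0) + 1
    return max(sums, key=lambda c: sums[c] / counts[c])


# Made-up toy data: 6 samples x 3 genes; only gene 0 separates the classes.
expr = [
    [5.0, 1.0, 0.2],
    [6.0, 1.2, 0.9],
    [5.5, 0.9, 0.1],
    [1.0, 1.1, 0.8],
    [0.5, 1.0, 0.3],
    [1.5, 0.8, 0.7],
]
labels = [1, 1, 1, 0, 0, 0]

scores = signal_to_noise(expr, labels)
markers = select_genes(scores, k=1)

# Trim every sample to the selected marker genes before training the PNN.
reduced = [[row[g] for g in markers] for row in expr]
new_sample = [5.2, 1.0, 0.5]
prediction = pnn_predict(reduced, labels, [new_sample[g] for g in markers])
```

Because the PNN stores the training samples directly and has no iterative weight fitting, reducing the number of genes shortens every distance computation, which is one plausible reading of the computation-speed remark in the abstract.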
