A Probabilistic Neural Network for Gene Selection and Classification of Microarray Data

In this paper, we present the mathematical foundations of a probabilistic neural network for gene selection and classification of high-dimensional microarray data. We then set out a catalogue of features that a classification system for microarray data should incorporate. Using this catalogue, we compare the theoretical properties of probabilistic neural networks and support vector machines with regard to their suitability for multiclass cancer prediction. We also compare the classification performance of a probabilistic neural network and a support vector machine on a multiclass microarray data set. Both the theoretical and the practical comparison suggest that the probabilistic neural network approach is preferable to support vector machines for multiclass cancer classification using microarray data.
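For readers unfamiliar with the model, the following minimal sketch illustrates the general idea behind a probabilistic neural network: each class density is estimated with a Gaussian Parzen window centred on the training samples, and a new sample is assigned to the class with the largest estimated posterior. It is only an illustration under stated assumptions, not the network described in this paper: the function names, the single shared smoothing parameter `sigma`, the equal class priors, and the toy data are all hypothetical, and no gene selection is performed.

```python
import math

def pnn_posteriors(train_x, train_y, x, sigma=1.0):
    """Estimate class posteriors for sample x with a Gaussian Parzen window.

    Assumptions (not taken from the paper): one shared smoothing
    parameter sigma for all features and equal prior class probabilities.
    """
    kernel_sums = {}
    counts = {}
    for xi, yi in zip(train_x, train_y):
        # Squared Euclidean distance between the training sample and x.
        d2 = sum((a - b) ** 2 for a, b in zip(xi, x))
        kernel_sums[yi] = kernel_sums.get(yi, 0.0) + math.exp(-d2 / (2.0 * sigma ** 2))
        counts[yi] = counts.get(yi, 0) + 1
    # Average kernel activation per class (proportional to the class density).
    density = {c: kernel_sums[c] / counts[c] for c in kernel_sums}
    total = sum(density.values())
    if total == 0.0:
        # All kernels underflowed to zero; fall back to a uniform posterior.
        return {c: 1.0 / len(density) for c in density}
    return {c: d / total for c, d in density.items()}

def pnn_predict(train_x, train_y, x, sigma=1.0):
    """Return the class with the largest estimated posterior."""
    post = pnn_posteriors(train_x, train_y, x, sigma)
    return max(post, key=post.get)

# Hypothetical toy data: three classes, two "expression" features each.
toy_x = [[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9], [0.0, 5.0], [0.1, 5.2]]
toy_y = ["A", "A", "B", "B", "C", "C"]

if __name__ == "__main__":
    print(pnn_predict(toy_x, toy_y, [0.1, 0.0]))
    print(pnn_posteriors(toy_x, toy_y, [5.1, 5.0]))
```

Because the output is a normalized posterior rather than a bare label, a sketch like this extends to any number of classes without being decomposed into binary subproblems. In practice `sigma` would have to be tuned, for example by cross-validation.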
