A probabilistic multi-class strategy of one-vs.-rest support vector machines for cancer classification

Support vector machines (SVMs), originally designed for binary classification, have been applied to multi-class classification through decomposition and reconstruction schemes. Decomposition schemes such as one-vs.-rest (OVR) and pairwise partition a dataset into several two-class subproblems, producing multiple outputs that must be combined. Majority voting or winner-takes-all is a representative reconstruction scheme for combining those outputs, but it often raises the problems of breaking ties and tuning the weights of individual classifiers. In this paper, we propose a novel method in which SVMs are generated with the OVR scheme and probabilistically ordered using naive Bayes classifiers (NBs). This method can break the ties that frequently occur in multi-class classification systems built from OVR SVMs. When constructing the NBs, we use the Pearson correlation to select informative genes and reduce the dimensionality of the gene expression profiles. The proposed method has been validated on several popular multi-class cancer datasets and produced higher accuracy than conventional methods.
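
The two ingredients named in the abstract, Pearson-based gene selection and a probabilistic ordering of OVR SVMs, can be illustrated with a minimal sketch. It is one plausible reading of the abstract, not the authors' implementation: the function names (`pearson`, `select_genes`, `resolve_ovr`), the use of a 1/0 class indicator as the correlation target, and the fallback to the top-ranked NB class when no SVM accepts a sample are all assumptions made for illustration. The OVR SVM decisions and NB posteriors are taken as given inputs.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    # A constant vector has no defined correlation; treat it as uninformative.
    return cov / (sx * sy) if sx and sy else 0.0


def select_genes(X, labels, target, k):
    """Return indices of the k genes most correlated with class `target`.

    Assumption: genes are scored by |r| against a 1/0 indicator vector
    that marks which samples belong to `target`.
    """
    indicator = [1.0 if c == target else 0.0 for c in labels]
    n_genes = len(X[0])
    scores = [
        abs(pearson([row[g] for row in X], indicator)) for g in range(n_genes)
    ]
    return sorted(range(n_genes), key=lambda g: -scores[g])[:k]


def resolve_ovr(svm_accepts, nb_posterior):
    """Combine OVR SVM outputs using an NB-derived class ordering.

    svm_accepts:  dict mapping class -> True if that class's OVR SVM
                  labelled the sample as positive.
    nb_posterior: dict mapping class -> NB posterior probability.

    Classes are visited in descending NB posterior; the first class whose
    SVM accepted the sample wins. Assumption: if no SVM accepts, fall
    back to the class the NB ranks highest.
    """
    ordered = sorted(nb_posterior, key=nb_posterior.get, reverse=True)
    for c in ordered:
        if svm_accepts[c]:
            return c
    return ordered[0]
```

Under this reading, a sample accepted by exactly one OVR SVM is assigned to that class regardless of the NB ranking; the ordering only matters when several SVMs, or none, claim the sample.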
