Multiclass LS-SVMs: Moderated Outputs and Coding-Decoding Schemes

A common way of solving the multiclass categorization problem is to reformulate it as a set of binary classification problems. Discriminative binary classifiers such as Support Vector Machines (SVMs) directly optimize the decision boundary with respect to a certain cost function. In a pragmatic and computationally simple approach, Least Squares SVMs (LS-SVMs) are inferred by minimizing a related regression least squares cost function. The moderated outputs of the binary classifiers are obtained in a second step within the evidence framework. In this paper, Bayes' rule is repeatedly applied to infer the posterior multiclass probabilities, using the moderated outputs of the binary plug-in classifiers and the prior multiclass probabilities. This Bayesian decoding motivates the use of loss function based decoding instead of Hamming decoding. For SVMs and LS-SVMs with a linear kernel, experimental evidence suggests the use of one-versus-one coding. With a Radial Basis Function kernel, one-versus-one coding and error-correcting output codes yield the best performance, but simpler codings may still yield satisfactory results.
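The difference between Hamming decoding and loss function based decoding can be illustrated with a minimal sketch for a one-versus-one code. This is not the Bayesian decoding derived in the paper. The function names, the least squares loss L(z) = (1 - z)^2, and the convention that a zero code entry contributes 1/2 to the Hamming distance are assumptions made for illustration, and the binary outputs are made-up numbers rather than results from the paper.

```python
from itertools import combinations


def one_vs_one_code(n_classes):
    """Code matrix with one row per class and one column per class pair (i, j).

    An entry is +1 for class i, -1 for class j, and 0 when the class
    does not take part in that binary problem.
    """
    pairs = list(combinations(range(n_classes), 2))
    return [[1 if c == i else -1 if c == j else 0 for (i, j) in pairs]
            for c in range(n_classes)]


def sign(x):
    return (x > 0) - (x < 0)


def hamming_decode(code, outputs):
    """Pick the class whose codeword disagrees least with the output signs.

    Assumption: a zero code entry contributes 1/2 to the distance.
    """
    dists = [sum((1 - m * sign(f)) / 2 for m, f in zip(row, outputs))
             for row in code]
    return min(range(len(code)), key=dists.__getitem__)


def loss_decode(code, outputs, loss=lambda z: (1 - z) ** 2):
    """Pick the class whose codeword gives the smallest total loss.

    Assumption: the least squares loss is used as the default.
    """
    totals = [sum(loss(m * f) for m, f in zip(row, outputs))
              for row in code]
    return min(range(len(code)), key=totals.__getitem__)


code = one_vs_one_code(3)        # columns: (0 vs 1), (0 vs 2), (1 vs 2)
outputs = [0.1, -0.1, 0.9]       # hypothetical real-valued binary outputs

print(hamming_decode(code, outputs))  # 0: all three classes tie, first one wins
print(loss_decode(code, outputs))     # 1: the confident (1 vs 2) output decides
```

In this example the output signs form a cycle, so Hamming decoding ties across all classes and falls back on an arbitrary tie-break, whereas the loss-based rule uses the output magnitudes and selects the class supported by the one confident binary classifier.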
