Support Vector Machines, Kernel Logistic Regression and Boosting

The support vector machine (SVM) is known for its excellent performance in binary classification, i.e., when the response y ∈ {-1, 1}, but its appropriate extension to the multi-class case is still an ongoing research issue. Another weakness of the SVM is that it only estimates sign[p(x) - 1/2], where p(x) = P(Y = 1 | X = x) is the conditional probability of a point being in class 1 given X = x, while the probability p(x) itself is often of interest. We propose a new approach to classification, called the import vector machine (IVM), which is built on kernel logistic regression (KLR). We show on some examples that the IVM performs as well as the SVM in binary classification. The IVM generalizes naturally to the multi-class case and, furthermore, provides an estimate of the underlying class probabilities. Similar to the "support points" of the SVM, the IVM model uses only a fraction of the training data to index kernel basis functions, typically a much smaller fraction than the SVM uses. This can give the IVM a computational advantage over the SVM, especially when the training data set is large. We illustrate these techniques on some examples and make connections with boosting, another popular machine-learning method for classification.
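To make the KLR-plus-subset idea concrete, below is a minimal NumPy sketch of the two ingredients the abstract describes: regularized kernel logistic regression fitted by Newton's method, and a greedy forward selection of "import points" that index the kernel basis functions. The function names (rbf_kernel, fit_klr, select_imports), the RBF kernel, the plain ridge penalty on the coefficients, and the greedy negative-log-likelihood selection criterion are all illustrative assumptions, not the authors' exact IVM algorithm; labels are recoded from {-1, 1} to {0, 1} for the logistic likelihood.

```python
import numpy as np

def rbf_kernel(X1, X2, gamma=1.0):
    """Gaussian (RBF) kernel matrix between rows of X1 and X2."""
    sq = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def fit_klr(K, y, lam=1e-3, n_iter=25):
    """Kernel logistic regression via Newton's method (IRLS).

    K   : (n, m) kernel matrix between the n training points and the
          m points that index the basis functions; y is in {0, 1}.
    lam : ridge penalty on the coefficients (a simplification of the
          RKHS-norm penalty used in KLR).
    Returns coefficients a so that f(x) = K(x, imports) @ a.
    """
    n, m = K.shape
    a = np.zeros(m)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-K @ a))          # fitted probabilities
        W = p * (1.0 - p)                          # IRLS weights
        H = K.T @ (W[:, None] * K) + lam * np.eye(m)   # penalized Hessian
        g = K.T @ (y - p) - lam * a                # penalized gradient ascent direction
        a += np.linalg.solve(H, g)                 # Newton step
    return a

def select_imports(X, y, n_imports=10, gamma=1.0, lam=1e-3):
    """Greedy forward selection of import points: at each step, add the
    training point whose kernel basis most reduces the training
    negative log-likelihood (an illustrative stand-in for the paper's
    selection criterion)."""
    chosen, remaining = [], list(range(len(y)))
    for _ in range(n_imports):
        best_j, best_nll = None, np.inf
        for j in remaining:
            K = rbf_kernel(X, X[chosen + [j]], gamma)
            a = fit_klr(K, y, lam)
            p = 1.0 / (1.0 + np.exp(-K @ a))
            eps = 1e-12
            nll = -(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps)).sum()
            if nll < best_nll:
                best_j, best_nll = j, nll
        chosen.append(best_j)
        remaining.remove(best_j)
    return chosen

# Usage on synthetic data: the fitted model returns class probabilities,
# unlike the SVM, which only estimates sign[p(x) - 1/2].
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)          # labels recoded to {0, 1}
imports = select_imports(X, y, n_imports=5)
K = rbf_kernel(X, X[imports])
a = fit_klr(K, y)
p_hat = 1.0 / (1.0 + np.exp(-K @ a))               # estimated P(Y = 1 | X = x)
```

Note that only the five selected import points index basis functions, mirroring the abstract's claim that the IVM uses a small fraction of the training data; the naive greedy loop here refits the model for every candidate and would need the cheaper update schemes discussed in the paper to scale to large training sets.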
