To deal with known limitations of the hard margin support vector machine (SVM) for binary classification, such as overfitting and the inability to handle data sets that are not linearly separable, a soft margin approach has been proposed in the literature [2, 4, 5]. The soft margin SVM allows training data to be misclassified to a certain extent by introducing slack variables and penalizing the cost function with an error term, namely the 1-norm or 2-norm of the corresponding slack vector. A regularization parameter C trades off the importance of maximizing the margin against that of minimizing the error. While the 2-norm soft margin algorithm itself is well understood and a generalization bound is known [4, 5], no computationally tractable method for tuning the soft margin parameter C has been proposed so far. In this report we present a convex way to optimize C for the 2-norm soft margin SVM by maximizing this generalization bound. The resulting problem is a quadratically constrained quadratic programming (QCQP) problem, which can be solved in polynomial time O(l), where l is the number of training samples.
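For concreteness, the sketch below shows the standard dual of the 2-norm soft margin SVM, where the parameter C enters simply as a 1/C term added to the diagonal of the kernel matrix [4, 5]. It is not the QCQP proposed in the report for tuning C; it only illustrates the problem in which C appears. The use of cvxpy, the linear kernel, the toy data, and the function name are assumptions made for illustration.

```python
# Minimal sketch of the 2-norm soft margin SVM dual (not the report's C-tuning QCQP).
# Assumptions: linear kernel, cvxpy as the QP solver, synthetic toy data.
import numpy as np
import cvxpy as cp

def train_2norm_soft_margin_svm(X, y, C):
    """Solve the dual:  max_a  sum(a) - 0.5 * a^T diag(y) (K + I/C) diag(y) a
                         s.t.   a >= 0,  y^T a = 0
    The 2-norm slack penalty shows up only as the I/C term on the kernel diagonal."""
    l = X.shape[0]
    K = X @ X.T                                   # linear kernel (assumption)
    G = np.outer(y, y) * (K + np.eye(l) / C)      # strictly positive definite for C > 0
    G = 0.5 * (G + G.T)                           # symmetrize for the quad_form check
    a = cp.Variable(l)
    objective = cp.Maximize(cp.sum(a) - 0.5 * cp.quad_form(a, G))
    constraints = [a >= 0, y @ a == 0]
    cp.Problem(objective, constraints).solve()
    return a.value

# Toy usage: two overlapping Gaussian blobs, so some slack is actually needed.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.0, 1.0, (20, 2)), rng.normal(1.0, 1.0, (20, 2))])
y = np.hstack([-np.ones(20), np.ones(20)])
alpha = train_2norm_soft_margin_svm(X, y, C=1.0)
print("number of support vectors:", int(np.sum(alpha > 1e-6)))
```

Because the only effect of C is this diagonal shift, any procedure for choosing C (grid search, or the convex formulation of the report) ultimately amounts to selecting how strongly the kernel diagonal is regularized.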
[1] Nello Cristianini et al. Margin Distribution and Soft Margin, 2000.
[2] Tong Zhang et al. An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods, AI Mag., 2001.
[3] Nello Cristianini et al. On the generalization of soft margin algorithms, IEEE Trans. Inf. Theory, 2002.
[4] Nello Cristianini et al. Learning the Kernel Matrix with Semidefinite Programming, J. Mach. Learn. Res., 2002.
[5] Corinna Cortes et al. Support-Vector Networks, Machine Learning, 1995.