Semi-Supervised Learning Using Semi-Definite Programming

We discuss the problem of support vector machine (SVM) transduction, a combinatorial problem whose computational cost grows exponentially with the number of unlabeled samples. Different approaches to such combinatorial problems exist, among them exact integer programming approaches (feasible only for very small sample sizes, e.g. [1]) and local search heuristics that start from a suitably chosen initial value, such as the approach explained in Chapter 5, Transductive Support Vector Machines, and introduced in [2] (scalable to large problem sizes, but sensitive to local optima). In this chapter, we discuss an alternative approach introduced in [3], which is based on a convex relaxation of the optimization problem associated with SVM transduction. The result is a semi-definite programming (SDP) problem that can be solved in polynomial time. Its solution provides both an approximation of the optimal labeling and a bound on the true optimum of the original transduction objective function. To further reduce the computational cost, we propose an approximation that makes it possible to solve transduction problems with up to 1000 unlabeled samples. Lastly, we extend the formulation to more general semi-supervised learning settings, in which equivalence and inequivalence constraints are given on the labels of some of the samples.
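
To make the idea of the relaxation concrete, the following is a minimal sketch in LaTeX notation. It is a simplification under stated assumptions and not necessarily the exact formulation of [3]: it assumes a hard-margin SVM without a bias term, and the symbols K (kernel matrix over all samples), y (full label vector), y_l (known labels), alpha (dual variables) and Gamma (label matrix) are notation introduced here for illustration.

```latex
% Assumed setting: hard-margin SVM dual without bias term.
% Transduction searches over the unknown entries of y in {-1,+1}^n:
\begin{align}
  \min_{y \in \{-1,+1\}^n,\; y_{\mathrm{labeled}} = y_l} \;
  \max_{\alpha \ge 0} \;\;
  2\,\alpha^\top \mathbf{1} - \alpha^\top \bigl(K \odot y y^\top\bigr)\,\alpha
\end{align}
% Substituting \Gamma = y y^\top gives an equivalent problem with
% \Gamma \succeq 0, \operatorname{diag}(\Gamma) = \mathbf{1}, \operatorname{rank}(\Gamma) = 1.
% Dropping the non-convex rank constraint yields the relaxation:
\begin{align}
  \min_{\Gamma} \; \max_{\alpha \ge 0} \;\;
  & 2\,\alpha^\top \mathbf{1} - \alpha^\top \bigl(K \odot \Gamma\bigr)\,\alpha \\
  \text{s.t.} \;\;
  & \Gamma \succeq 0, \quad \operatorname{diag}(\Gamma) = \mathbf{1}, \quad
    \Gamma_{ij} = y_{l,i}\, y_{l,j} \;\; \text{for labeled } i, j .
\end{align}
```

In this sketch the inner maximum is a pointwise maximum of functions that are linear in Gamma, so the relaxed objective is convex in Gamma. Because the relaxed feasible set contains every rank-one label matrix, the relaxed optimum cannot exceed the combinatorial optimum, which is the sense in which it bounds the original objective.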