Large margin vs. large volume in transductive learning

We consider a large volume principle for transductive learning that prioritizes the transductive equivalence classes according to the volume they occupy in hypothesis space. We approximate volume maximization using a geometric interpretation of the hypothesis space. The resulting algorithm is defined via a non-convex optimization problem that can nevertheless be solved exactly and efficiently. We provide a bound on the test error of the algorithm and compare it with transductive SVM (TSVM) on 31 datasets.
