Large Scale Transductive SVMs

We show how the concave-convex procedure can be applied to transductive SVMs, which traditionally require solving a combinatorial search problem. This provides, for the first time, a highly scalable algorithm for the nonlinear case. Detailed experiments verify the utility of our approach. Software is available at http://www.kyb.tuebingen.mpg.de/bs/people/fabee/transduction.html.
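To illustrate the general idea behind the concave-convex procedure, the following is a minimal sketch on a toy one-dimensional problem, not the transductive SVM objective itself. The objective x^4 - 2x^2, the function names, and the stopping tolerance are all assumptions chosen for illustration. Each step replaces the concave part with its tangent at the current iterate and minimizes the resulting convex surrogate, so the objective never increases.

```python
import math


def objective(x):
    """Toy non-convex objective: convex part x**4 plus concave part -2*x**2."""
    return x ** 4 - 2.0 * x ** 2


def cccp_step(x_t):
    """One concave-convex step for the toy objective.

    The concave part -2*x**2 has derivative -4*x_t at x_t, so the convex
    surrogate is x**4 - 4*x_t*x (up to a constant). Setting its derivative
    to zero gives x**3 = x_t, i.e. x is the real cube root of x_t.
    """
    return math.copysign(abs(x_t) ** (1.0 / 3.0), x_t)


def cccp(x0, max_iter=100, tol=1e-12):
    """Iterate cccp_step from x0 until successive iterates differ by less than tol."""
    x = x0
    history = [objective(x)]  # objective value at every iterate
    for _ in range(max_iter):
        x_new = cccp_step(x)
        history.append(objective(x_new))
        converged = abs(x_new - x) < tol
        x = x_new
        if converged:
            break
    return x, history


if __name__ == "__main__":
    x_star, history = cccp(0.5)
    print("local minimizer:", x_star)
    print("objective values:", history[:5])
```

In this toy case the inner convex problem has a closed-form solution. For a transductive SVM the inner problem would instead be a convex SVM-like optimization solved at each outer iteration, which this sketch does not attempt to reproduce.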
