Large-Scale Sparsified Manifold Regularization

Semi-supervised learning is more powerful than supervised learning by using both labeled and unlabeled data. In particular, the manifold regularization framework, together with kernel methods, leads to the Laplacian SVM (LapSVM) that has demonstrated state-of-the-art performance. However, the LapSVM solution typically involves kernel expansions of all the labeled and unlabeled examples, and is slow on testing. Moreover, existing semi-supervised learning methods, including the LapSVM, can only handle a small number of unlabeled examples. In this paper, we integrate manifold regularization with the core vector machine, which has been used for large-scale supervised and unsupervised learning. By using a sparsified manifold regularizer and formulating as a center-constrained minimum enclosing ball problem, the proposed method produces sparse solutions with low time and space complexities. Experimental results show that it is much faster than the LapSVM, and can handle a million unlabeled examples on a standard PC; while the LapSVM can only handle several thousand patterns.

[1]  Nicolas Le Roux,et al.  Efficient Non-Parametric Function Induction in Semi-Supervised Learning , 2004, AISTATS.

[2]  Vikas Sindhwani,et al.  The Geometric Basis of Semi-Supervised Learning , 2006, Semi-Supervised Learning.

[3]  Xiaojin Zhu,et al.  Harmonic mixtures: combining mixture models and graph-based methods for inductive and scalable semi-supervised learning , 2005, ICML.

[4]  M. Griebel,et al.  Semi-supervised learning with sparse grids , 2005, ICML 2005.

[5]  Alexander Zien,et al.  Semi-Supervised Learning , 2006 .

[6]  Bernhard Schölkopf,et al.  Introduction to Semi-Supervised Learning , 2006, Semi-Supervised Learning.

[7]  Ivor W. Tsang,et al.  Core Vector Regression for very large regression problems , 2005, ICML.

[8]  Thomas Gärtner,et al.  Large-Scale Multiclass Transduction , 2005, NIPS.

[9]  Mikhail Belkin,et al.  Beyond the point cloud: from transductive to semi-supervised learning , 2005, ICML.

[10]  Yuh-Jye Lee,et al.  RSVM: Reduced Support Vector Machines , 2001, SDM.

[11]  Federico Girosi,et al.  An Equivalence Between Sparse Approximation and Support Vector Machines , 1998, Neural Computation.

[12]  Mikhail Belkin,et al.  Manifold Regularization: A Geometric Framework for Learning from Labeled and Unlabeled Examples , 2006, J. Mach. Learn. Res..

[13]  Ivor W. Tsang,et al.  Core Vector Machines: Fast SVM Training on Very Large Data Sets , 2005, J. Mach. Learn. Res..

[14]  Xiaojin Zhu,et al.  --1 CONTENTS , 2006 .

[15]  R. Tibshirani Regression Shrinkage and Selection via the Lasso , 1996 .