Laplacian Embedded Regression for Scalable Manifold Regularization

Semi-supervised learning (SSL), as a powerful tool to learn from a limited number of labeled data and a large number of unlabeled data, has been attracting increasing attention in the machine learning community. In particular, the manifold regularization framework has laid solid theoretical foundations for a large family of SSL algorithms, such as Laplacian support vector machine (LapSVM) and Laplacian regularized least squares (LapRLS). However, most of these algorithms are limited to small scale problems due to the high computational cost of the matrix inversion operation involved in the optimization problem. In this paper, we propose a novel framework called Laplacian embedded regression by introducing an intermediate decision variable into the manifold regularization framework. By using ∈-insensitive loss, we obtain the Laplacian embedded support vector regression (LapESVR) algorithm, which inherits the sparse solution from SVR. Also, we derive Laplacian embedded RLS (LapERLS) corresponding to RLS under the proposed framework. Both LapESVR and LapERLS posses a simpler form of a transformed kernel, which is the summation of the original kernel and a graph kernel that captures the manifold structure. The benefits of the transformed kernel are two-fold: (1) we can deal with the original kernel matrix and the graph Laplacian matrix in the graph kernel separately and (2) if the graph Laplacian matrix is sparse, we only need to perform the inverse operation for a sparse matrix, which is much more efficient when compared with that for a dense one. Inspired by kernel principal component analysis, we further propose to project the introduced decision variable into a subspace spanned by a few eigenvectors of the graph Laplacian matrix in order to better reflect the data manifold, as well as accelerate the calculation of the graph kernel, allowing our methods to efficiently and effectively cope with large scale SSL problems. Extensive experiments on both toy and real world data sets show the effectiveness and scalability of the proposed framework.

[1]  Alexander J. Smola,et al.  Kernels and Regularization on Graphs , 2003, COLT.

[2]  Mikhail Belkin,et al.  Manifold Regularization: A Geometric Framework for Learning from Labeled and Unlabeled Examples , 2006, J. Mach. Learn. Res..

[3]  Ivor W. Tsang,et al.  Improved Nyström low-rank approximation and error analysis , 2008, ICML '08.

[4]  Ayhan Demiriz,et al.  Semi-Supervised Support Vector Machines , 1998, NIPS.

[5]  James T. Kwok,et al.  Prototype vector machine for large scale semi-supervised learning , 2009, ICML '09.

[6]  Ivor W. Tsang,et al.  Flexible Manifold Embedding: A Framework for Semi-Supervised and Unsupervised Dimension Reduction , 2010, IEEE Transactions on Image Processing.

[7]  Matthias W. Seeger,et al.  Using the Nyström Method to Speed Up Kernel Machines , 2000, NIPS.

[8]  Ivor W. Tsang,et al.  Core Vector Machines: Fast SVM Training on Very Large Data Sets , 2005, J. Mach. Learn. Res..

[9]  James T. Kwok,et al.  Clustered Nyström Method for Large Scale Manifold Learning and Dimension Reduction , 2010, IEEE Transactions on Neural Networks.

[10]  Antonio Torralba,et al.  Semi-Supervised Learning in Gigantic Image Collections , 2009, NIPS.

[11]  Gunnar Rätsch,et al.  An introduction to kernel-based learning algorithms , 2001, IEEE Trans. Neural Networks.

[12]  Vladimir Vapnik,et al.  Statistical learning theory , 1998 .

[13]  Thorsten Joachims,et al.  Transductive Learning via Spectral Graph Partitioning , 2003, ICML.

[14]  Geoffrey E. Hinton,et al.  Modeling the manifolds of images of handwritten digits , 1997, IEEE Trans. Neural Networks.

[15]  Mikhail Belkin,et al.  Regularization and Semi-supervised Learning on Large Graphs , 2004, COLT.

[16]  Chao Yang,et al.  ARPACK users' guide - solution of large-scale eigenvalue problems with implicitly restarted Arnoldi methods , 1998, Software, environments, tools.

[17]  Alexander Zien,et al.  Semi-Supervised Learning , 2006 .

[18]  Qiang Yang,et al.  Adaptive Localization in a Dynamic WiFi Environment through Multi-view Learning , 2007, AAAI.

[19]  Genevieve Gorrell,et al.  Generalized Hebbian Algorithm for Incremental Singular Value Decomposition in Natural Language Processing , 2006, EACL.

[20]  Xiaojin Zhu,et al.  --1 CONTENTS , 2006 .

[21]  Chih-Jen Lin,et al.  LIBSVM: A library for support vector machines , 2011, TIST.

[22]  Mikhail Belkin,et al.  Laplacian Support Vector Machines Trained in the Primal , 2009, J. Mach. Learn. Res..

[23]  Yi Yang,et al.  A Multimedia Retrieval Framework Based on Semi-Supervised Ranking and Relevance Feedback , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[24]  Yoshua Bengio,et al.  Gradient-based learning applied to document recognition , 1998, Proc. IEEE.

[25]  Wei Liu,et al.  Large Graph Construction for Scalable Semi-Supervised Learning , 2010, ICML.

[27]  Zoubin Ghahramani,et al.  Combining active learning and semi-supervised learning using Gaussian fields and harmonic functions , 2003, ICML 2003.

[28]  Bernhard Schölkopf,et al.  Learning with Local and Global Consistency , 2003, NIPS.

[29]  Bernhard Schölkopf,et al.  Cluster Kernels for Semi-Supervised Learning , 2002, NIPS.

[30]  Thorsten Joachims,et al.  Transductive Inference for Text Classification using Support Vector Machines , 1999, ICML.

[31]  Zenglin Xu,et al.  Discriminative Semi-Supervised Feature Selection Via Manifold Regularization , 2009, IEEE Transactions on Neural Networks.

[32]  Mikhail Belkin,et al.  Semi-supervised Learning by Higher Order Regularization , 2011, AISTATS.

[33]  Joachim M. Buhmann,et al.  Denoising and Dimension Reduction in Feature Space , 2006, NIPS.

[34]  Ameet Talwalkar,et al.  Large-scale manifold learning , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[35]  John C. Platt,et al.  Fast training of support vector machines using sequential minimal optimization, advances in kernel methods , 1999 .

[36]  Nathan Srebro,et al.  Statistical Analysis of Semi-Supervised Learning: The Limit of Infinite Unlabelled Data , 2009, NIPS.

[37]  Mikhail Belkin,et al.  Tikhonov regularization and semi-supervised learning on large graphs , 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[38]  Mikhail Belkin,et al.  Linear Manifold Regularization for Large Scale Semi-supervised Learning , 2005 .

[39]  Ivor W. Tsang,et al.  Large-Scale Sparsified Manifold Regularization , 2006, NIPS.

[40]  Bernhard Schölkopf,et al.  A tutorial on support vector regression , 2004, Stat. Comput..

[41]  Ivor W. Tsang,et al.  Large-Scale Maximum Margin Discriminant Analysis Using Core Vector Machines , 2008, IEEE Transactions on Neural Networks.

[42]  Jacek M. Zurada,et al.  Generalized Core Vector Machines , 2006, IEEE Transactions on Neural Networks.

[43]  Dong Xu,et al.  Semi-Supervised Bilinear Subspace Learning , 2009, IEEE Transactions on Image Processing.

[44]  Alexander Zien,et al.  Semi-Supervised Classification by Low Density Separation , 2005, AISTATS.

[45]  Mikhail Belkin,et al.  Laplacian Eigenmaps and Spectral Techniques for Embedding and Clustering , 2001, NIPS.

[46]  Jason Weston,et al.  Large Scale Transductive SVMs , 2006, J. Mach. Learn. Res..

[47]  Dong Xu,et al.  Semi-Supervised Dimension Reduction Using Trace Ratio Criterion , 2012, IEEE Transactions on Neural Networks and Learning Systems.

[48]  Henry Tirri,et al.  A Probabilistic Approach to WLAN User Location Estimation , 2002, Int. J. Wirel. Inf. Networks.

[49]  Wei Liu,et al.  Hashing with Graphs , 2011, ICML.

[50]  Alain Biem,et al.  Semisupervised Least Squares Support Vector Machine , 2009, IEEE Transactions on Neural Networks.