Enhancing semi-supervised learning through label-aware base kernels

Currently, a large family of kernel design methods for semi-supervised learning (SSL) problems builds the kernel by weighted averaging of predefined base kernels (i.e., those spanned by kernel eigenvectors). Optimization of the base kernel weights has been studied extensively in the literature. However, little attention was devoted to designing high-quality base kernels. The eigenvectors of the kernel matrix, which are computed irrespective of class labels, may not always reveal useful structures of the target. As a result, the generalization performance can be poor however hard the base kernel weighting is tuned. On the other hand, there are many SSL algorithms whose focus are not on kernel design but on the estimation of the class labels directly. Motivated by the label propagation approach, in this paper we propose to construct novel kernel eigenvectors by injecting the class label information under the framework of eigenfunction extrapolation. A set of "label-aware" base kernels can be obtained with greatly improved quality, which leads to higher target alignment and henceforth better performance. Our approach is computationally efficient, and demonstrates encouraging performance in semi-supervised classification and regression tasks.

[1]  Hujun Bao,et al.  A Variance Minimization Criterion to Feature Selection Using Laplacian Regularization , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2]  Qiang Yang,et al.  Estimating Location Using Wi-Fi , 2008, IEEE Intelligent Systems.

[3]  James T. Kwok,et al.  Scaling Up Graph-Based Semisupervised Learning via Prototype Vector Machines , 2015, IEEE Transactions on Neural Networks and Learning Systems.

[4]  Nello Cristianini,et al.  Learning the Kernel Matrix with Semidefinite Programming , 2002, J. Mach. Learn. Res..

[5]  Bernhard Schölkopf,et al.  A tutorial on support vector regression , 2004, Stat. Comput..

[6]  Jitendra Malik,et al.  Normalized cuts and image segmentation , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[7]  Jason Weston,et al.  Large Scale Transductive SVMs , 2006, J. Mach. Learn. Res..

[8]  Mehryar Mohri,et al.  Two-Stage Learning Kernel Algorithms , 2010, ICML.

[9]  Marius Kloft,et al.  Learning Kernels Using Local Rademacher Complexity , 2013, NIPS.

[10]  Thorsten Joachims,et al.  Transductive Inference for Text Classification using Support Vector Machines , 1999, ICML.

[11]  Michael I. Jordan,et al.  On Spectral Clustering: Analysis and an algorithm , 2001, NIPS.

[12]  Inderjit S. Dhillon,et al.  Semi-supervised graph clustering: a kernel approach , 2005, ICML '05.

[13]  Zhi-Hua Zhou,et al.  Towards Making Unlabeled Data Never Hurt , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[14]  Mikhail Belkin,et al.  Laplacian Support Vector Machines Trained in the Primal , 2009, J. Mach. Learn. Res..

[15]  Mikhail Belkin,et al.  Manifold Regularization: A Geometric Framework for Learning from Labeled and Unlabeled Examples , 2006, J. Mach. Learn. Res..

[16]  Mikhail Belkin,et al.  Semi-supervised Learning by Higher Order Regularization , 2011, AISTATS.

[17]  Mikhail Belkin,et al.  Semi-Supervised Learning Using Sparse Eigenfunction Bases , 2009, AAAI Fall Symposium: Manifold Learning and Its Applications.

[18]  Ivor W. Tsang,et al.  Learning with Idealized Kernels , 2003, ICML.

[19]  Zoubin Ghahramani,et al.  Combining active learning and semi-supervised learning using Gaussian fields and harmonic functions , 2003, ICML 2003.

[20]  Bernhard Schölkopf,et al.  Learning with Local and Global Consistency , 2003, NIPS.

[21]  John D. Lafferty,et al.  Diffusion Kernels on Graphs and Other Discrete Input Spaces , 2002, ICML.

[22]  Bernhard Schölkopf,et al.  Cluster Kernels for Semi-Supervised Learning , 2002, NIPS.

[23]  Zoubin Ghahramani,et al.  Nonparametric Transforms of Graph Kernels for Semi-Supervised Learning , 2004, NIPS.

[24]  Jitendra Malik,et al.  Spectral grouping using the Nystrom method , 2004, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[25]  Nicolas Le Roux,et al.  Learning Eigenfunctions Links Spectral Embedding and Kernel PCA , 2004, Neural Computation.

[26]  John Shawe-Taylor,et al.  The Stability of Kernel Principal Components Analysis and its Relation to the Process Eigenspectrum , 2002, NIPS.

[27]  Alexander Zien,et al.  Semi-Supervised Classification by Low Density Separation , 2005, AISTATS.

[28]  N. Cristianini,et al.  On Kernel-Target Alignment , 2001, NIPS.

[29]  Alexander J. Smola,et al.  Kernels and Regularization on Graphs , 2003, COLT.

[30]  Matthias W. Seeger,et al.  Using the Nyström Method to Speed Up Kernel Machines , 2000, NIPS.