SimpleNPKL: simple non-parametric kernel learning

Previous studies of Non-Parametric Kernel (NPK) learning usually reduce the task to a Semi-Definite Programming (SDP) problem that is solved by a standard SDP solver. However, the time complexity of standard interior-point SDP solvers can be as high as O(n^6.5). Such a high computational cost prevents NPK learning from being applied to real applications, even for data sets of moderate size. In this paper, we propose an efficient approach to NPK learning from side information, referred to as SimpleNPKL, which can efficiently learn non-parametric kernels from large sets of pairwise constraints. In particular, we show that SimpleNPKL with linear loss has a closed-form solution that can be computed simply by the Lanczos algorithm. Moreover, we show that SimpleNPKL with square hinge loss can be reformulated as a saddle-point optimization task, which can then be solved by a fast iterative algorithm. Our empirical results show that, compared with previous approaches, the new technique achieves the same accuracy but is significantly more efficient and scalable.
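
To give a rough sense of why a Lanczos-based solution scales better than an interior-point SDP solver, the following Python sketch builds a positive semi-definite kernel matrix from the leading eigenpairs of a sparse symmetric matrix. It is only an illustration under stated assumptions: the abstract does not give the closed form, so the construction of the matrix from pairwise constraints, the helper names `constraint_matrix` and `psd_kernel_from_top_eigs`, the rank parameter `k`, and the trace normalization are all hypothetical choices, not the method of the paper. The only library fact relied on is that `scipy.sparse.linalg.eigsh` computes a few extreme eigenpairs of a symmetric matrix with ARPACK's implicitly restarted Lanczos method.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import eigsh


def constraint_matrix(n, must_link, cannot_link):
    """Hypothetical helper: sparse symmetric matrix with +1 for must-link
    pairs, -1 for cannot-link pairs, and 1 on the diagonal."""
    rows, cols, vals = list(range(n)), list(range(n)), [1.0] * n
    for pairs, sign in ((must_link, 1.0), (cannot_link, -1.0)):
        for i, j in pairs:
            # Store both (i, j) and (j, i) so the matrix stays symmetric.
            rows += [i, j]
            cols += [j, i]
            vals += [sign, sign]
    return csr_matrix((vals, (rows, cols)), shape=(n, n), dtype=float)


def psd_kernel_from_top_eigs(A, k, trace_bound=1.0):
    """Illustrative only: keep the positive part of the k largest
    eigenpairs of A and rescale so that trace(K) == trace_bound."""
    # which="LA" asks ARPACK (Lanczos) for the k largest algebraic eigenvalues.
    vals, vecs = eigsh(A, k=k, which="LA")
    keep = vals > 1e-10
    vals, vecs = vals[keep], vecs[:, keep]
    if vals.size == 0:
        return np.zeros(A.shape)
    K = (vecs * vals) @ vecs.T  # V diag(vals) V^T, PSD by construction
    return K * (trace_bound / np.trace(K))


if __name__ == "__main__":
    A = constraint_matrix(6, must_link=[(0, 1), (1, 2), (3, 4)],
                          cannot_link=[(0, 3), (2, 5)])
    K = psd_kernel_from_top_eigs(A, k=3)
    print(np.round(K, 3))
```

Only `k` eigenpairs are computed and the sparse matrix is accessed through matrix-vector products, which is what keeps this kind of computation cheap compared with a generic SDP solve.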
