Generalizing the Bias Term of Support Vector Machines

Based on a study of a generalized form of the representer theorem and a specific trick for constructing kernels, a generic learning model is proposed and applied to support vector machines (SVMs). The resulting algorithm naturally generalizes the bias term of the SVM. Unlike the solution of the standard SVM, which consists of a linear expansion of kernel functions plus a bias term, the generalized algorithm also maps predefined features onto a Hilbert space and treats them specially by leaving part of that space unregularized when seeking a solution. Empirical evaluations confirm the effectiveness of the generalization in classification tasks.
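To make the contrast with the standard SVM concrete, the following is a minimal sketch of the kind of solution described above, written in the usual semiparametric form. The notation is an assumption introduced here for illustration and is not taken from the paper: $K$ is the kernel, $\psi_1,\dots,\psi_p$ are the predefined features, $\alpha_i$ and $\beta_j$ are the expansion coefficients, $\lambda$ is the regularization parameter, and $V$ is the hinge loss.

```latex
% Standard SVM: kernel expansion plus a scalar bias b
f(x) = \sum_{i=1}^{m} \alpha_i K(x_i, x) + b

% Generalized form (sketch): the bias is replaced by a span of predefined features
f(x) = \underbrace{\sum_{i=1}^{m} \alpha_i K(x_i, x)}_{h(x)\ \text{(regularized)}}
     + \underbrace{\sum_{j=1}^{p} \beta_j \psi_j(x)}_{\text{unregularized}}

% Only the kernel part h is penalized; the coefficients \beta_j are left free
\min_{h,\ \beta}\ \frac{1}{m} \sum_{i=1}^{m} V\bigl(y_i, f(x_i)\bigr) + \lambda \|h\|_K^2,
\qquad V(y, t) = \max(0,\ 1 - y t)
```

Under these assumptions, choosing $p = 1$ and $\psi_1(x) \equiv 1$ gives $\beta_1 = b$ and recovers the standard SVM, which is the sense in which the unregularized part generalizes the bias term.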
