Morozov, Ivanov and Tikhonov Regularization Based LS-SVMs

This paper contrasts three related regularization schemes for kernel machines using a least squares criterion: Tikhonov regularization, Ivanov regularization, and Morozov's discrepancy principle. We derive the optimality conditions for each scheme in a least squares support vector machine (LS-SVM) context, where the schemes differ in the role played by the regularization parameter. In particular, the Ivanov and Morozov schemes express the trade-off between data fitting and smoothness in terms of the trust region of the parameters and the noise level, respectively. Both quantities can be transformed uniquely into an appropriate regularization constant for a standard LS-SVM. This insight is used to tune the regularization constant of an LS-SVM automatically from the estimated noise level, which can be obtained with a nonparametric technique such as the differogram estimator.
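As a rough illustration of how the three schemes differ, they can be sketched as follows in common LS-SVM notation. The symbols are assumptions not fixed by the abstract: weights $w$, bias $b$, residuals $e_i$, feature map $\varphi$, training pairs $(x_i, y_i)$ for $i = 1, \dots, N$, regularization constant $\gamma$, trust-region radius $\pi$, and noise variance $\sigma^2$. The scaling factors, and whether the extra constraints are stated as equalities or inequalities, may differ from the paper's own formulation.

```latex
% All three problems share the model constraint
%   w^\top \varphi(x_i) + b + e_i = y_i, \quad i = 1, \dots, N.

% Tikhonov: the trade-off is set by the constant \gamma.
\min_{w,b,e} \; \frac{1}{2} w^\top w + \frac{\gamma}{2} \sum_{i=1}^{N} e_i^2

% Ivanov: the trade-off is set by the trust region \pi^2 on the parameters.
\min_{w,b,e} \; \frac{1}{2} \sum_{i=1}^{N} e_i^2
\quad \text{s.t.} \quad w^\top w \le \pi^2

% Morozov: the trade-off is set by the noise level \sigma^2.
\min_{w,b,e} \; \frac{1}{2} w^\top w
\quad \text{s.t.} \quad \sum_{i=1}^{N} e_i^2 = N \sigma^2
```

Read this way, an estimate of $\sigma^2$ fixes the Morozov constraint, and the corresponding $\gamma$ for a standard LS-SVM is the value at which the Tikhonov solution's residuals satisfy that constraint.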
