Optimal regression rates for SVMs using Gaussian kernels

Support vector machines (SVMs) using Gaussian kernels are among the standard, state-of-the-art learning algorithms. In this work, we establish new oracle inequalities for such SVMs when applied to either least squares or conditional quantile regression. With the help of these oracle inequalities, we then derive learning rates that are (essentially) minimax optimal under standard smoothness assumptions on the target function. We further use the oracle inequalities to show that these learning rates can be achieved adaptively by a simple data-dependent parameter selection method that splits the data set into a training set and a validation set.
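As a rough illustration of the training/validation idea for the least squares case, the following Python sketch fits a Gaussian-kernel least squares SVM on the first half of the data for each candidate pair of regularization parameter and kernel width, then keeps the pair with the smallest validation error. Several details are assumptions made for illustration and are not taken from the abstract: the kernel parametrization exp(-|x - z|^2 / gamma^2), the objective lam * ||f||^2 + (1/n) * sum of squared residuals (whose minimizer has coefficients solving (K + n * lam * I) alpha = y), the even split, scalar inputs, and all function names and candidate grids.

```python
import math
import random


def gaussian_kernel(x, z, gamma):
    """Gaussian kernel exp(-(x - z)^2 / gamma^2) for scalar inputs (assumed form)."""
    return math.exp(-((x - z) ** 2) / gamma ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def fit_ls_svm(xs, ys, lam, gamma):
    """Minimize lam * ||f||^2 + mean squared error over the Gaussian RKHS.

    The minimizer is f(x) = sum_i alpha_i k(x_i, x) with
    (K + n * lam * I) alpha = y.
    """
    n = len(xs)
    K = [[gaussian_kernel(a, b, gamma) for b in xs] for a in xs]
    for i in range(n):
        K[i][i] += n * lam
    alpha = solve(K, ys)
    return lambda x: sum(a * gaussian_kernel(x, xi, gamma) for a, xi in zip(alpha, xs))


def mse(f, xs, ys):
    """Empirical least squares risk of f on the sample (xs, ys)."""
    return sum((f(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def select_by_validation(xs, ys, lambdas, gammas):
    """Train on the first half, validate on the second half, keep the best pair."""
    m = len(xs) // 2
    x_tr, y_tr, x_val, y_val = xs[:m], ys[:m], xs[m:], ys[m:]
    best = None
    for lam in lambdas:
        for gamma in gammas:
            f = fit_ls_svm(x_tr, y_tr, lam, gamma)
            err = mse(f, x_val, y_val)
            if best is None or err < best[0]:
                best = (err, lam, gamma, f)
    return best  # (validation error, lam, gamma, fitted function)


if __name__ == "__main__":
    # Synthetic data, purely illustrative: y = sin(3x) + Gaussian noise.
    rng = random.Random(0)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(60)]
    ys = [math.sin(3.0 * x) + rng.gauss(0.0, 0.1) for x in xs]
    err, lam, gamma, f = select_by_validation(
        xs, ys, lambdas=[1e-4, 1e-2, 1.0], gammas=[0.1, 0.5, 2.0]
    )
    print("selected lam:", lam, "gamma:", gamma, "validation MSE:", err)
```

The candidate grids here are fixed and tiny. A grid intended to track the rates discussed above would have to grow with the sample size, and the quantile regression case would replace the squared loss by the pinball loss, which no longer has a closed-form solution.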
