Optimal Learning with Anisotropic Gaussian SVMs

This paper investigates the nonparametric regression problem using SVMs with anisotropic Gaussian RBF kernels. Under the assumption that the target functions reside in certain anisotropic Besov spaces, we establish almost optimal learning rates, that is, rates optimal up to a logarithmic factor, expressed in terms of the effective smoothness. By taking the effective smoothness into account, these rates are faster than those obtained when the underlying RKHSs are certain anisotropic Sobolev spaces. Moreover, if the target function depends only on a smaller number of the input dimensions, even faster learning rates can be achieved.
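
For concreteness, a common form of the anisotropic Gaussian RBF kernel with bandwidth vector \gamma = (\gamma_1, \dots, \gamma_d), together with one standard convention for the effective smoothness associated with an anisotropic smoothness vector r = (r_1, \dots, r_d), is sketched below; the precise normalizations used in the paper may differ from these conventions.

    k_\gamma(x, x') \;=\; \exp\!\Big( -\sum_{i=1}^{d} \frac{(x_i - x_i')^2}{\gamma_i^2} \Big),
    \qquad
    \bar r \;=\; d \,\Big( \sum_{i=1}^{d} \frac{1}{r_i} \Big)^{-1}.

Under this (harmonic-mean) convention, the classical minimax rate for least squares regression over such an anisotropic class scales as n^{-2\bar r/(2\bar r + d)}, which reduces to the familiar isotropic rate n^{-2r/(2r + d)} when r_1 = \dots = r_d = r.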
