Risk Convergence of Centered Kernel Ridge Regression with Large Dimensional Data

This paper carries out a large dimensional analysis of a variation of kernel ridge regression that we call centered kernel ridge regression (CKRR), also known in the literature as kernel ridge regression with offset. This modified technique is obtained by accounting for the bias in the regression problem, which yields the classical kernel ridge regression but with centered kernels. The analysis is carried out under the assumption that the data is drawn from a Gaussian distribution and relies heavily on tools from random matrix theory (RMT). Under the regime in which the data dimension and the training size grow infinitely large at a fixed ratio, and under some mild assumptions controlling the data statistics, we show that both the empirical and the prediction risks converge to deterministic quantities that describe in closed form the performance of CKRR in terms of the data statistics and dimensions. A key insight of the proposed analysis is that, asymptotically, a large class of kernels achieves the same minimum prediction risk. This insight is validated with synthetic data.
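To make the link between the offset and the centered kernel concrete, the following is a minimal NumPy sketch, not the paper's implementation. It assumes a Gaussian (RBF) kernel, a ridge penalty scaled as n*lam, and hypothetical function names (`rbf_kernel`, `ckrr_fit`, `ckrr_predict`, `risk`); none of these choices is taken from the abstract. Eliminating the offset b from the regularized least-squares problem leaves a ridge system in the centered kernel P K P, where P = I - 11^T/n.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    # Gaussian kernel exp(-gamma * ||a - b||^2); the kernel choice is an assumption.
    sq = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def ckrr_fit(X, y, lam, gamma):
    # Kernel ridge regression with an offset b, solved through the centered kernel.
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma)
    P = np.eye(n) - np.ones((n, n)) / n          # centering projector
    Kc = P @ K @ P                               # centered kernel matrix
    # The n * lam scaling of the ridge term is an assumed convention.
    alpha = np.linalg.solve(Kc + n * lam * np.eye(n), P @ y)
    b = y.mean() - (K @ alpha).mean()            # offset recovered after solving
    return alpha, b

def ckrr_predict(X_train, X_new, alpha, b, gamma):
    # Prediction f(x) = k(x)^T alpha + b with the uncentered cross-kernel.
    return rbf_kernel(X_new, X_train, gamma) @ alpha + b

def risk(y_true, y_pred):
    # Mean squared error: the empirical risk on training data,
    # or an estimate of the prediction risk on held-out data.
    return np.mean((y_true - y_pred) ** 2)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p = 200, 100                              # dimension and sample size of comparable order
    X, X_test = rng.standard_normal((n, p)), rng.standard_normal((n, p))
    f = lambda Z: np.sin(Z.sum(axis=1) / np.sqrt(p))   # synthetic target, illustrative only
    y, y_test = f(X) + 0.1 * rng.standard_normal(n), f(X_test)
    alpha, b = ckrr_fit(X, y, lam=0.1, gamma=1.0 / p)
    print("empirical risk :", risk(y, ckrr_predict(X, X, alpha, b, 1.0 / p)))
    print("prediction risk:", risk(y_test, ckrr_predict(X, X_test, alpha, b, 1.0 / p)))
```

The returned coefficients sum to zero and satisfy (K + n*lam*I) alpha + b*1 = y, which is the system obtained when the offset is kept as an explicit unknown. This is the sense in which regression with offset reduces to ridge regression on a centered kernel.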
