暂无分享,去创建一个
[1] Zhu Li,et al. Towards a Unified Analysis of Random Fourier Features , 2018, ICML.
[2] Cheng Wang,et al. Optimal learning rates for least squares regularized regression with unbounded sampling , 2011, J. Complex..
[3] Tengyu Ma,et al. Optimal Regularization Can Mitigate Double Descent , 2020, ICLR.
[4] M. Lerasle,et al. Benign overfitting in the large deviation regime , 2020, 2003.05838.
[5] Yi Ma,et al. Rethinking Bias-Variance Trade-off for Generalization of Neural Networks , 2020, ICML.
[6] Mohamed-Slim Alouini,et al. Risk Convergence of Centered Kernel Ridge Regression with Large Dimensional Data , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[7] Andreas Christmann,et al. Support vector machines , 2008, Data Mining and Knowledge Discovery Handbook.
[8] S. Smale,et al. Learning Theory Estimates via Integral Operators and Their Approximations , 2007 .
[9] Noureddine El Karoui,et al. The spectrum of kernel random matrices , 2010, 1001.0492.
[10] Ingo Steinwart,et al. Sobolev Norm Learning Rates for Regularized Least-Squares Algorithms , 2017, J. Mach. Learn. Res..
[11] Gilles Blanchard,et al. Optimal learning rates for Kernel Conjugate Gradient regression , 2010, NIPS.
[12] Arthur Jacot,et al. Kernel Alignment Risk Estimator: Risk Prediction from Training Data , 2020, NeurIPS.
[13] Sanjiv Kumar,et al. Orthogonal Random Features , 2016, NIPS.
[14] R. Shah,et al. Least Squares Support Vector Machines , 2022 .
[15] J. Zico Kolter,et al. A Continuous-Time View of Early Stopping for Least Squares Regression , 2018, AISTATS.
[16] Lorenzo Rosasco,et al. Elastic-net regularization in learning theory , 2008, J. Complex..
[17] Lorenzo Rosasco,et al. Generalization Properties of Learning with Random Features , 2016, NIPS.
[18] Dmitry Kobak,et al. The Optimal Ridge Penalty for Real-world High-dimensional Data Can Be Zero or Negative due to the Implicit Ridge Regularization , 2020, J. Mach. Learn. Res..
[19] Mikhail Belkin,et al. Reconciling modern machine-learning practice and the classical bias–variance trade-off , 2018, Proceedings of the National Academy of Sciences.
[20] Stefan Wager,et al. High-Dimensional Asymptotics of Prediction: Ridge Regression and Classification , 2015, 1507.03003.
[21] Ameya Velingker,et al. Random Fourier Features for Kernel Ridge Regression: Approximation Bounds and Statistical Guarantees , 2018, ICML.
[22] Lorenzo Rosasco,et al. On the Sample Complexity of Subspace Learning , 2013, NIPS.
[23] Richard G. Baraniuk,et al. The Implicit Regularization of Ordinary Least Squares Ensembles , 2020, AISTATS.
[24] Andrea Montanari,et al. The Generalization Error of Random Features Regression: Precise Asymptotics and the Double Descent Curve , 2019, Communications on Pure and Applied Mathematics.
[25] Yoram Singer,et al. Train faster, generalize better: Stability of stochastic gradient descent , 2015, ICML.
[26] Blake Bordelon,et al. Spectrum Dependent Learning Curves in Kernel Regression and Wide Neural Networks , 2020, ICML.
[27] Ding-Xuan Zhou,et al. Distributed Learning with Regularized Least Squares , 2016, J. Mach. Learn. Res..
[28] Lorenzo Rosasco,et al. Interpolation and Learning with Scale Dependent Kernels , 2020, ArXiv.
[29] Lorenzo Rosasco,et al. Asymptotics of Ridge(less) Regression under General Source Condition , 2020, AISTATS.
[30] Ding-Xuan Zhou,et al. Learning Theory: An Approximation Theory Viewpoint , 2007 .
[31] Stephane Chretien,et al. A finite sample analysis of the double descent phenomenon for ridge function estimation , 2020, ArXiv.
[32] Samy Bengio,et al. Understanding deep learning requires rethinking generalization , 2016, ICLR.
[33] Michael W. Mahoney,et al. Exact expressions for double descent and implicit regularization via surrogate random design , 2019, NeurIPS.
[34] Mikhail Belkin,et al. Does data interpolation contradict statistical optimality? , 2018, AISTATS.
[35] V. Marčenko,et al. DISTRIBUTION OF EIGENVALUES FOR SOME SETS OF RANDOM MATRICES , 1967 .
[36] Weilin Li. Generalization error of minimum weighted norm and kernel interpolation , 2020, ArXiv.
[37] Tengyuan Liang,et al. On the Multiple Descent of Minimum-Norm Interpolants and Restricted Lower Isometry of Kernels , 2019, COLT.
[38] Felipe Cucker,et al. On the mathematical foundations of learning , 2001 .
[39] Lei Shi,et al. Learning Theory of Distributed Regression with Bias Corrected Regularization Kernel Network , 2017, J. Mach. Learn. Res..
[40] Nello Cristianini,et al. On the Eigenspectrum of the Gram Matrix and Its Relationship to the Operator Eigenspectrum , 2002, ALT.
[41] Francis R. Bach,et al. Sharp analysis of low-rank kernel matrix approximations , 2012, COLT.
[42] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[43] Andrea Montanari,et al. Surprises in High-Dimensional Ridgeless Least Squares Interpolation , 2019, Annals of statistics.
[44] Martin J. Wainwright,et al. Divide and Conquer Kernel Ridge Regression , 2013, COLT.
[45] Blake Bordelon,et al. Spectral Bias and Task-Model Alignment Explain Generalization in Kernel Regression and Infinitely Wide Neural Networks , 2020 .
[46] Ingo Steinwart,et al. Fast rates for support vector machines using Gaussian kernels , 2007, 0708.1838.
[47] Nitish Srivastava,et al. Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..
[48] Mohamed-Slim Alouini,et al. Risk Convergence of Centered Kernel Ridge Regression With Large Dimensional Data , 2019, IEEE Transactions on Signal Processing.
[49] Andrea Montanari,et al. Linearized two-layers neural networks in high dimension , 2019, The Annals of Statistics.