Harmless interpolation in regression and classification with structured features
[1] Richard G. Baraniuk, et al. A Farewell to the Bias-Variance Tradeoff? An Overview of the Theory of Overparameterized Machine Learning, 2021, arXiv.
[2] Zhi-Hua Zhou, et al. Towards an Understanding of Benign Overfitting in Neural Networks, 2021, arXiv.
[3] Mikhail Belkin, et al. Fit without fear: remarkable mathematical phenomena of deep learning through the prism of interpolation, 2021, Acta Numerica.
[4] Mikhail Belkin, et al. Risk Bounds for Over-parameterized Maximum Margin Classification on Sub-Gaussian Mixtures, 2021, NeurIPS.
[5] Mingqi Wu, et al. How rotational invariance of common kernels prevents generalization in high dimensions, 2021, ICML.
[6] Andrea Montanari, et al. Deep learning: a statistical viewpoint, 2021, Acta Numerica.
[7] Andrea Montanari, et al. Generalization error of random feature and kernel methods: hypercontractivity and kernel matrix concentration, 2021, Applied and Computational Harmonic Analysis.
[8] Christos Thrampoulidis, et al. Benign Overfitting in Binary Classification of Gaussian Mixtures, 2020, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[9] Jeffrey Pennington, et al. Understanding Double Descent Requires a Fine-Grained Bias-Variance Decomposition, 2020, NeurIPS.
[10] Edgar Dobriban, et al. What causes the test error? Going beyond bias-variance via ANOVA, 2020, J. Mach. Learn. Res.
[11] P. Bartlett, et al. Benign overfitting in ridge regression, 2020, J. Mach. Learn. Res.
[12] Roman Vershynin. High-Dimensional Probability: An Introduction with Applications in Data Science, 2018, Cambridge University Press.
[13] Daniel J. Hsu, et al. On the proliferation of support vectors in high dimensions, 2020, AISTATS.
[14] Yue M. Lu, et al. Universality Laws for High-Dimensional Learning With Random Features, 2020, IEEE Transactions on Information Theory.
[15] Yue M. Lu, et al. A Precise Performance Analysis of Learning with Random Features, 2020, arXiv.
[16] Justin Romberg, et al. Sample complexity and effective dimension for regression on manifolds, 2020, NeurIPS.
[17] Mikhail Belkin, et al. Evaluation of Neural Architectures Trained with Square Loss vs Cross-Entropy in Classification Tasks, 2020, ICLR.
[18] Michael W. Mahoney, et al. A random matrix analysis of random Fourier features: beyond the Gaussian kernel, a precise phase transition, and the corresponding double descent, 2020, NeurIPS.
[19] Mikhail Belkin, et al. Classification vs regression in overparameterized regimes: Does the loss function matter?, 2020, J. Mach. Learn. Res.
[20] Murat A. Erdogdu, et al. Generalization of Two-layer Neural Networks: An Asymptotic Viewpoint, 2020, ICLR.
[21] Philip M. Long, et al. Finite-sample analysis of interpolating linear classifiers in the overparameterized regime, 2020, J. Mach. Learn. Res.
[22] Martin J. Wainwright. High-Dimensional Statistics: A Non-Asymptotic Viewpoint, 2019, Cambridge University Press.
[23] G. Biroli, et al. Double Trouble in Double Descent: Bias and Variance(s) in the Lazy Regime, 2020, ICML.
[24] Florent Krzakala, et al. Generalisation error in learning with random features and the hidden manifold model, 2020, ICML.
[25] Tengyuan Liang, et al. On the Multiple Descent of Minimum-Norm Interpolants and Restricted Lower Isometry of Kernels, 2019, COLT.
[26] Andrea Montanari, et al. The Generalization Error of Random Features Regression: Precise Asymptotics and the Double Descent Curve, 2019, Communications on Pure and Applied Mathematics.
[27] Philip M. Long, et al. Benign overfitting in linear regression, 2019, Proceedings of the National Academy of Sciences.
[28] Andrea Montanari, et al. Linearized two-layers neural networks in high dimension, 2019, The Annals of Statistics.
[29] Anant Sahai, et al. Harmless interpolation of noisy data in regression, 2019, IEEE International Symposium on Information Theory (ISIT).
[30] T. Hastie, et al. Surprises in High-Dimensional Ridgeless Least Squares Interpolation, 2019, Annals of Statistics.
[31] Mikhail Belkin, et al. Two models of double descent for weak features, 2019, SIAM J. Math. Data Sci.
[32] Alexander Rakhlin, et al. Consistency of Interpolation with Laplace Kernels is a High-Dimensional Phenomenon, 2018, COLT.
[33] Mikhail Belkin, et al. Reconciling modern machine-learning practice and the classical bias–variance trade-off, 2018, Proceedings of the National Academy of Sciences.
[34] Tengyuan Liang, et al. Just Interpolate: Kernel "Ridgeless" Regression Can Generalize, 2018, The Annals of Statistics.
[35] Mikhail Belkin, et al. To understand deep learning we need to understand kernel learning, 2018, ICML.
[36] Mikhail Belkin, et al. Approximation beats concentration? An approximation view on inference with smooth radial kernels, 2018, COLT.
[37] Samy Bengio, et al. Understanding deep learning requires rethinking generalization, 2016, ICLR.
[38] Joel A. Tropp, et al. An Introduction to Matrix Concentration Inequalities, 2015, Found. Trends Mach. Learn.
[39] M. Rudelson, et al. Hanson-Wright inequality and sub-gaussian concentration, 2013, arXiv:1306.2872.
[40] Ingo Steinwart, et al. Mercer’s Theorem on General Domains: On the Interaction between Measures, Kernels, and RKHSs, 2012.
[41] A. Caponnetto, et al. Optimal Rates for the Regularized Least-Squares Algorithm, 2007, Found. Comput. Math.
[42] V. Koltchinskii, et al. High Dimensional Probability, 2006, arXiv:math/0612726.
[43] Tong Zhang, et al. Learning Bounds for Kernel Regression Using Effective Data Dimensionality, 2005, Neural Computation.
[44] A. Tsybakov, et al. Fast learning rates for plug-in classifiers, 2005, arXiv:0708.2321.
[45] Vladimir Koltchinskii, et al. Exponential Convergence Rates in Classification, 2005, COLT.
[46] Bernhard Schölkopf and Alexander J. Smola. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond, 2002, MIT Press.
[47] Peter L. Bartlett, et al. Rademacher and Gaussian Complexities: Risk Bounds and Structural Results, 2003, J. Mach. Learn. Res.
[48] Roger A. Horn and Charles R. Johnson. Matrix Analysis, 1985, Cambridge University Press.
[49] Christian Reimers. Understanding deep learning, 2023.
[51] Holger Wendland. Scattered Data Approximation, 2004, Cambridge University Press.
[53] Don R. Hush, et al. Optimal Rates for Regularized Least Squares Regression, 2009, COLT.
[54] Jerome H. Friedman, et al. On Bias, Variance, 0/1-Loss, and the Curse-of-Dimensionality, 2004, Data Mining and Knowledge Discovery.
[55] Vladimir N. Vapnik. The Nature of Statistical Learning Theory, 2000, Statistics for Engineering and Information Science.
[56] László Györfi, et al. A Probabilistic Theory of Pattern Recognition, 1996, Stochastic Modelling and Applied Probability.
[57] L. Ryd, et al. On bias, 1994, Acta Orthopaedica Scandinavica.