Failures of model-dependent generalization bounds for least-norm interpolation
[1] P. Bartlett, et al. Benign overfitting in ridge regression, 2020, J. Mach. Learn. Res.
[2] O. Papaspiliopoulos. High-Dimensional Probability: An Introduction with Applications in Data Science, 2020.
[3] Dino Sejdinovic, et al. Benign Overfitting and Noisy Features, 2020, ArXiv.
[4] Michael W. Mahoney, et al. Exact expressions for double descent and implicit regularization via surrogate random design, 2019, NeurIPS.
[5] Daniel M. Roy, et al. In Defense of Uniform Convergence: Generalization via derandomization with an application to interpolating predictors, 2019, ICML.
[6] Philip M. Long, et al. Benign overfitting in linear regression, 2019, Proceedings of the National Academy of Sciences.
[7] Vitaly Feldman, et al. Does learning require memorization? a short tale about a long tail, 2019, STOC.
[8] Yuan Cao, et al. Generalization Bounds of Stochastic Gradient Descent for Wide and Deep Neural Networks, 2019, NeurIPS.
[9] Philip M. Long, et al. Generalization bounds for deep convolutional neural networks, 2019, ICLR.
[10] T. Hastie, et al. Surprises in High-Dimensional Ridgeless Least Squares Interpolation, 2019, Annals of Statistics.
[11] J. Zico Kolter, et al. Uniform convergence may be unable to explain generalization in deep learning, 2019, NeurIPS.
[12] Ruosong Wang, et al. Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks, 2019, ICML.
[13] Mikhail Belkin, et al. Reconciling modern machine-learning practice and the classical bias–variance trade-off, 2018, Proceedings of the National Academy of Sciences.
[14] Yuanzhi Li, et al. Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data, 2018, NeurIPS.
[15] Tengyuan Liang, et al. Just Interpolate: Kernel "Ridgeless" Regression Can Generalize, 2018, The Annals of Statistics.
[16] Mikhail Belkin, et al. Does data interpolation contradict statistical optimality?, 2018, AISTATS.
[17] O. Shamir, et al. Size-Independent Sample Complexity of Neural Networks, 2017, COLT.
[18] Matus Telgarsky, et al. Spectrally-normalized margin bounds for neural networks, 2017, NIPS.
[19] Samy Bengio, et al. Understanding deep learning requires rethinking generalization, 2016, ICLR.
[20] Ryota Tomioka, et al. Norm-Based Capacity Control in Neural Networks, 2015, COLT.
[21] Warren D. Smith, et al. Testing Closeness of Discrete Distributions, 2010, JACM.
[22] Sidney Addelman, et al. trans-Dimethanolbis(1,1,1-trifluoro-5,5-dimethylhexane-2,4-dionato)zinc(II), 2008, Acta crystallographica. Section E, Structure reports online.
[23] Santosh S. Vempala, et al. The geometry of logconcave functions and sampling algorithms, 2007, Random Struct. Algorithms.
[24] Max Buot. Probability and Computing: Randomized Algorithms and Probabilistic Analysis, 2006.
[25] Peter L. Bartlett, et al. Rademacher and Gaussian Complexities: Risk Bounds and Structural Results, 2003, J. Mach. Learn. Res.
[26] Ronitt Rubinfeld, et al. Testing that distributions are close, 2000, Proceedings 41st Annual Symposium on Foundations of Computer Science.
[27] S. Boucheron, et al. Model Selection and Error Estimation, 2000, Machine Learning.
[28] Peter L. Bartlett, et al. The Sample Complexity of Pattern Classification with Neural Networks: The Size of the Weights is More Important than the Size of the Network, 1998, IEEE Trans. Inf. Theory.
[29] David Haussler, et al. Decision Theoretic Generalizations of the PAC Model for Neural Net and Other Learning Applications, 1992, Inf. Comput.
[30] V. V. Buldygin, et al. Sub-Gaussian random variables, 1980.
[31] E. Slud. Distribution Inequalities for the Binomial Law, 1977.
[32] W. Feller. An Introduction to Probability Theory and Its Applications, 1959.
[33] Eli Upfal, et al. Probability and Computing: Randomized Algorithms and Probabilistic Analysis, 2005.
[34] M. W. Birch. Maximum Likelihood in Three-Way Contingency Tables, 1963.
[35] B. Harshbarger. An Introduction to Probability Theory and its Applications, Volume I, 1958.