[1] Nathan Srebro, et al. Implicit Regularization in Matrix Factorization, 2017, 2018 Information Theory and Applications Workshop (ITA).
[2] Yuanzhi Li, et al. Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers, 2018, NeurIPS.
[3] Samy Bengio, et al. Understanding deep learning (still) requires rethinking generalization, 2021, Commun. ACM.
[4] Samy Bengio, et al. Understanding deep learning requires rethinking generalization, 2016, ICLR.
[5] Yuanzhi Li, et al. On the Convergence Rate of Training Recurrent Neural Networks, 2018, NeurIPS.
[6] Noga Alon, et al. The Probabilistic Method, 2015, Fundamentals of Ramsey Theory.
[7] Nathan Srebro, et al. Characterizing Implicit Bias in Terms of Optimization Geometry, 2018, ICML.
[8] Yi Zhou, et al. SGD Converges to Global Minimum in Deep Learning via Star-convex Path, 2019, ICLR.
[9] Nathan Srebro, et al. The Implicit Bias of Gradient Descent on Separable Data, 2017, J. Mach. Learn. Res.
[10] Yoram Singer, et al. Pegasos: primal estimated sub-gradient solver for SVM, 2011, Math. Program.
[11] Raef Bassily, et al. Stability of Stochastic Gradient Descent on Nonsmooth Convex Losses, 2020, NeurIPS.
[12] Vladimir Braverman, et al. Benign Overfitting of Constant-Stepsize SGD for Linear Regression, 2021, COLT.
[13] Roi Livni, et al. SGD Generalizes Better Than GD (And Regularization Doesn't Help), 2021, COLT.
[14] A. S. Nemirovsky and D. B. Yudin. Problem Complexity and Method Efficiency in Optimization, 1983.
[15] Sanjeev Arora, et al. Implicit Regularization in Deep Matrix Factorization, 2019, NeurIPS.
[16] Nathan Srebro, et al. Implicit Bias of Gradient Descent on Linear Convolutional Networks, 2018, NeurIPS.
[17] Yuanzhi Li, et al. An Alternative View: When Does SGD Escape Local Minima?, 2018, ICML.
[18] Matus Telgarsky, et al. Risk and parameter convergence of logistic regression, 2018, arXiv.
[19] Sanjeev Arora, et al. On the Optimization of Deep Networks: Implicit Acceleration by Overparameterization, 2018, ICML.
[20] Nadav Cohen, et al. Implicit Regularization in Deep Learning May Not Be Explainable by Norms, 2020, NeurIPS.
[21] Roi Livni, et al. Can Implicit Bias Explain Generalization? Stochastic Convex Optimization as a Case Study, 2020, NeurIPS.
[22] D. Panchenko. Some Extensions of an Inequality of Vapnik and Chervonenkis, 2002, math/0405342.
[23] Vladimir N. Vapnik. The Nature of Statistical Learning Theory, 2000, Statistics for Engineering and Information Science.
[24] Ohad Shamir, et al. Stochastic Convex Optimization, 2009, COLT.
[25] Jorge Nocedal, et al. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima, 2016, ICLR.
[26] Yurii Nesterov. Introductory Lectures on Convex Optimization - A Basic Course, 2014, Applied Optimization.
[27] Vitaly Feldman, et al. Generalization of ERM in Stochastic Convex Optimization: The Dimension Strikes Back, 2016, NIPS.
[28] Léon Bottou, et al. The Tradeoffs of Large Scale Learning, 2007, NIPS.
[29] Raef Bassily, et al. The Power of Interpolation: Understanding the Effectiveness of SGD in Modern Over-parametrized Learning, 2017, ICML.
[30] Yuanzhi Li, et al. Can SGD Learn Recurrent Neural Networks with Provable Generalization?, 2019, NeurIPS.