Tomaso Poggio | Andrzej Banburski | Nishka Pant | Fernanda De La Torre | Ishana Shastri
[1] Sayan Mukherjee, et al. Learning theory: stability is sufficient for generalization and necessary and sufficient for consistency of empirical risk minimization, 2006, Adv. Comput. Math.
[2] Tomaso Poggio, et al. Complexity control by gradient descent in deep networks, 2020, Nature Communications.
[3] Yuanzhi Li, et al. A Convergence Theory for Deep Learning via Over-Parameterization, 2018, ICML.
[4] Liwei Wang, et al. Gradient Descent Finds Global Minima of Deep Neural Networks, 2018, ICML.
[5] Qianli Liao, et al. Theoretical issues in deep networks, 2020, Proceedings of the National Academy of Sciences.
[6] Surya Ganguli, et al. Deep Learning on a Data Diet: Finding Important Examples Early in Training, 2021, ArXiv.
[7] Partha Niyogi, et al. Almost-everywhere Algorithmic Stability and Generalization Error, 2002, UAI.
[8] David L. Donoho, et al. Prevalence of neural collapse during the terminal phase of deep learning training, 2020, Proceedings of the National Academy of Sciences.
[9] Qianli Liao, et al. Implicit dynamic regularization in deep networks, 2020.
[10] Yuanzhi Li, et al. Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data, 2018, NeurIPS.
[11] Yi Zhang, et al. Stronger generalization bounds for deep nets via a compression approach, 2018, ICML.
[12] Stefano Soatto, et al. Entropy-SGD: biasing gradient descent into wide valleys, 2016, ICLR.
[13] Tomaso Poggio, et al. Generalization in deep network classifiers trained with the square loss, 2020.
[14] Tomaso Poggio, et al. Loss landscape: SGD can have a better view than GD, 2020.
[15] Barnabás Póczos, et al. Gradient Descent Provably Optimizes Over-parameterized Neural Networks, 2018, ICLR.
[16] Gábor Lugosi, et al. Introduction to Statistical Learning Theory, 2004, Advanced Lectures on Machine Learning.
[17] Matthew Botvinick, et al. On the importance of single directions for generalization, 2018, ICLR.
[18] John Langford, et al. (Not) Bounding the True Error, 2001, NIPS.
[19] Tomaso A. Poggio, et al. A Surprising Linear Relationship Predicts Test Performance in Deep Networks, 2018, ArXiv.
[20] Kaifeng Lyu, et al. Gradient Descent Maximizes the Margin of Homogeneous Neural Networks, 2019, ICLR.