Generalization Puzzles in Deep Networks
Lorenzo Rosasco | Brando Miranda | Qianli Liao | Andrzej Banburski | Jack Hidary | Tomaso Poggio | Robert Liang
[1] Tomaso A. Poggio et al. Theory II: Landscape of the Empirical Risk in Deep Learning, ArXiv, 2017.
[2] Michael I. Jordan et al. Convexity, Classification, and Risk Bounds, 2006.
[3] Wei Hu et al. Algorithmic Regularization in Learning Deep Homogeneous Models: Layers are Automatically Balanced, NeurIPS, 2018.
[4] Peter L. Bartlett et al. Neural Network Learning: Theoretical Foundations, 1999.
[5] Stefano Soatto et al. Entropy-SGD: Biasing Gradient Descent into Wide Valleys, ICLR, 2016.
[6] Nathan Srebro et al. Exploring Generalization in Deep Learning, NIPS, 2017.
[7] Lorenzo Rosasco et al. Theory of Deep Learning III: Explaining the Non-Overfitting Puzzle, ArXiv, 2017.
[8] Jorge Nocedal et al. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima, ICLR, 2016.
[9] Samy Bengio et al. Understanding Deep Learning Requires Rethinking Generalization, ICLR, 2016.
[10] Razvan Pascanu et al. Sharp Minima Can Generalize for Deep Nets, ICML, 2017.
[11] John Shawe-Taylor et al. Generalization Performance of Support Vector Machines and Other Pattern Classifiers, 1999.
[12] Gábor Lugosi et al. Introduction to Statistical Learning Theory, Advanced Lectures on Machine Learning, 2004.
[13] Nathan Srebro et al. Stochastic Gradient Descent on Separable Data: Exact Convergence with a Fixed Learning Rate, AISTATS, 2018.
[14] Stefano Soatto et al. Stochastic Gradient Descent Performs Variational Inference, Converges to Limit Cycles for Deep Networks, Information Theory and Applications Workshop (ITA), 2018.
[15] Mikhail Belkin et al. To Understand Deep Learning We Need to Understand Kernel Learning, ICML, 2018.
[16] Matus Telgarsky et al. Spectrally-Normalized Margin Bounds for Neural Networks, NIPS, 2017.
[17] Jian Sun et al. Deep Residual Learning for Image Recognition, CVPR, 2016.
[18] Lorenzo Rosasco et al. Theory III: Dynamics and Generalization in Deep Networks, ArXiv, 2019.
[19] Michael W. Mahoney et al. Traditional and Heavy-Tailed Self Regularization in Neural Network Models, ICML, 2019.
[20] Tomaso A. Poggio et al. Theory IIIb: Generalization in Deep Networks, ArXiv, 2018.
[21] Tomaso A. Poggio et al. Fisher-Rao Metric, Geometry, and Complexity of Neural Networks, AISTATS, 2017.
[22] Tomaso A. Poggio et al. Theory of Deep Learning IIb: Optimization Properties of SGD, ArXiv, 2018.
[23] Nathan Srebro et al. The Implicit Bias of Gradient Descent on Separable Data, Journal of Machine Learning Research, 2017.