Fred Zhang | Boaz Barak | Dimitris Kalimeris | Gal Kaplun | Preetum Nakkiran | Benjamin L. Edelman | Tristan Yang
[1] Nathan Srebro, et al. Implicit Regularization in Matrix Factorization, 2017, 2018 Information Theory and Applications Workshop (ITA).
[2] Yoshua Bengio, et al. On the Spectral Bias of Deep Neural Networks, 2018, arXiv.
[3] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[4] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[5] Jian Sun, et al. Identity Mappings in Deep Residual Networks, 2016, ECCV.
[6] Jorge Nocedal, et al. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima, 2016, ICLR.
[7] Ryota Tomioka, et al. Norm-Based Capacity Control in Neural Networks, 2015, COLT.
[8] Nathan Srebro, et al. The Implicit Bias of Gradient Descent on Separable Data, 2017, J. Mach. Learn. Res.
[9] Gintare Karolina Dziugaite, et al. Computing Nonvacuous Generalization Bounds for Deep (Stochastic) Neural Networks with Many More Parameters than Training Data, 2017, UAI.
[10] Shai Shalev-Shwartz, et al. SGD Learns Over-parameterized Networks that Provably Generalize on Linearly Separable Data, 2017, ICLR.
[11] Peter L. Bartlett, et al. Neural Network Learning: Theoretical Foundations, 1999.
[12] Yoshua Bengio, et al. A Closer Look at Memorization in Deep Networks, 2017, ICML.
[13] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[14] Peter L. Bartlett, et al. Rademacher and Gaussian Complexities: Risk Bounds and Structural Results, 2003, J. Mach. Learn. Res.
[15] J. Zico Kolter, et al. Uniform convergence may be unable to explain generalization in deep learning, 2019, NeurIPS.
[16] Jascha Sohl-Dickstein, et al. Sensitivity and Generalization in Neural Networks: an Empirical Study, 2018, ICLR.
[17] Samy Bengio, et al. Understanding deep learning requires rethinking generalization, 2016, ICLR.
[18] Nathan Srebro, et al. Implicit Bias of Gradient Descent on Linear Convolutional Networks, 2018, NeurIPS.
[19] Geoffrey E. Hinton, et al. Keeping the neural networks simple by minimizing the description length of the weights, 1993, COLT '93.
[20] Ohad Shamir, et al. Size-Independent Sample Complexity of Neural Networks, 2017, COLT.
[21] Matus Telgarsky, et al. Spectrally-normalized margin bounds for neural networks, 2017, NIPS.
[22] Zhi-Qin John Xu, et al. Understanding training and generalization in deep learning by Fourier analysis, 2018, arXiv.
[23] William J. McGill. Multivariate information transmission, 1954, Trans. IRE Prof. Group Inf. Theory.
[24] Gene H. Golub, et al. Matrix Computations, 1983.
[25] Yifan Wu, et al. Towards Understanding the Generalization Bias of Two Layer Convolutional Linear Classifiers with Gradient Descent, 2019, AISTATS.
[26] Yi Zhang, et al. Stronger generalization bounds for deep nets via a compression approach, 2018, ICML.
[27] Victor Veitch, et al. Non-Vacuous Generalization Bounds at the ImageNet Scale: A PAC-Bayesian Compression Approach, 2019, ICLR.
[28] Hongyang Zhang, et al. Algorithmic Regularization in Over-parameterized Matrix Sensing and Neural Networks with Quadratic Activations, 2017, COLT.
[29] Chico Q. Camargo, et al. Deep learning generalizes because the parameter-function map is biased towards simple functions, 2018, ICLR.
[30] Yuanzhi Li, et al. Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers, 2018, NeurIPS.
[31] Yoshua Bengio, et al. Understanding the difficulty of training deep feedforward neural networks, 2010, AISTATS.
[32] A. J. Bell. The Co-Information Lattice, 2003.
[33] Geoffrey E. Hinton, et al. Speech recognition with deep recurrent neural networks, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[34] Yoram Singer, et al. Train faster, generalize better: Stability of stochastic gradient descent, 2015, ICML.
[35] Dimitris Achlioptas, et al. Bad Global Minima Exist and SGD Can Reach Them, 2019, NeurIPS.
[36] Matus Telgarsky, et al. The implicit bias of gradient descent on nonseparable data, 2019, COLT.
[37] Naftali Tishby, et al. Opening the Black Box of Deep Neural Networks via Information, 2017, arXiv.
[38] Yuanzhi Li, et al. Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data, 2018, NeurIPS.
[39] David A. McAllester, et al. A PAC-Bayesian Approach to Spectrally-Normalized Margin Bounds for Neural Networks, 2017, ICLR.
[40] Christoph H. Lampert, et al. Data-Dependent Stability of Stochastic Gradient Descent, 2017, ICML.
[41] Nathan Srebro, et al. Exploring Generalization in Deep Learning, 2017, NIPS.