Minnorm training: an algorithm for training overcomplete deep neural networks