Complexity control by gradient descent in deep networks
[1] Tomaso Poggio, et al. Double descent in the condition number, 2019, ArXiv.
[2] Kaifeng Lyu, et al. Gradient Descent Maximizes the Margin of Homogeneous Neural Networks, 2019, ICLR.
[3] Nathan Srebro, et al. Lexicographic and Depth-Sensitive Margins in Homogeneous and Non-Homogeneous Deep Models, 2019, ICML.
[4] Lorenzo Rosasco, et al. Theory III: Dynamics and Generalization in Deep Networks, 2019, ArXiv.
[5] Alexander Rakhlin, et al. Consistency of Interpolation with Laplace Kernels is a High-Dimensional Phenomenon, 2018, COLT.
[6] Tengyuan Liang, et al. Just Interpolate: Kernel "Ridgeless" Regression Can Generalize, 2018, The Annals of Statistics.
[7] Nathan Srebro, et al. Stochastic Gradient Descent on Separable Data: Exact Convergence with a Fixed Learning Rate, 2018, AISTATS.
[8] Nathan Srebro, et al. The Implicit Bias of Gradient Descent on Separable Data, 2017, J. Mach. Learn. Res..
[9] Ruslan Salakhutdinov, et al. Geometry of Optimization and Implicit Regularization in Deep Learning, 2017, ArXiv.
[10] Samy Bengio, et al. Understanding deep learning requires rethinking generalization, 2016, ICLR.
[11] Tim Salimans, et al. Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks, 2016, NIPS.
[12] Lorenzo Rosasco, et al. Learning with Incremental Iterative Regularization, 2014, NIPS.
[13] Sun-Yuan Kung, et al. On gradient adaptation with unit-norm constraints, 2000, IEEE Trans. Signal Process..
[14] Lea Fleischer, et al. Regularization of Inverse Problems, 1996.