Complexity control by gradient descent in deep networks
[1] Tim Salimans, et al. Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks, 2016, NIPS.
[2] Nathan Srebro, et al. The Implicit Bias of Gradient Descent on Separable Data, 2017, J. Mach. Learn. Res.
[3] Lorenzo Rosasco, et al. Learning with Incremental Iterative Regularization, 2014, NIPS.
[4] Nathan Srebro, et al. Lexicographic and Depth-Sensitive Margins in Homogeneous and Non-Homogeneous Deep Models, 2019, ICML.
[5] Samy Bengio, et al. Understanding Deep Learning Requires Rethinking Generalization, 2016, ICLR.
[6] Tomaso Poggio, et al. Double Descent in the Condition Number, 2019, arXiv.
[7] Kaifeng Lyu, et al. Gradient Descent Maximizes the Margin of Homogeneous Neural Networks, 2019, ICLR.
[8] Tengyuan Liang, et al. Just Interpolate: Kernel "Ridgeless" Regression Can Generalize, 2018, The Annals of Statistics.
[9] Alexander Rakhlin, et al. Consistency of Interpolation with Laplace Kernels Is a High-Dimensional Phenomenon, 2018, COLT.
[10] Lorenzo Rosasco, et al. Theory III: Dynamics and Generalization in Deep Networks, 2019, arXiv.
[11] Lea Fleischer, et al. Regularization of Inverse Problems, 1996.
[12] Sun-Yuan Kung, et al. On Gradient Adaptation with Unit-Norm Constraints, 2000, IEEE Trans. Signal Process.
[13] Nathan Srebro, et al. Stochastic Gradient Descent on Separable Data: Exact Convergence with a Fixed Learning Rate, 2018, AISTATS.
[14] Ruslan Salakhutdinov, et al. Geometry of Optimization and Implicit Regularization in Deep Learning, 2017, arXiv.