A Cyclical Learning Rate Method in Deep Learning Training