LALR: Theoretical and Experimental Validation of Lipschitz Adaptive Learning Rate in Regression and Neural Networks
Snehanshu Saha | T. S. B. Sudarshan | Soma S. Dhavala | Tejas Prashanth | Suraj Aralihalli | Sumedh Basarkod