Parameter Tuning Using Adaptive Moment Estimation in Deep Learning Neural Networks
[1] Ji Feng, et al. Deep Forest: Towards an Alternative to Deep Neural Networks, 2017, IJCAI.
[2] Andrea Montanari, et al. A mean field view of the landscape of two-layer neural networks, 2018, Proceedings of the National Academy of Sciences.
[3] Anna Dembinska, et al. Computing moments of discrete order statistics from non-identical distributions, 2018, J. Comput. Appl. Math.
[4] Neil D. Lawrence, et al. Deep Gaussian Processes, 2012, AISTATS.
[5] Surya Ganguli, et al. Identifying and attacking the saddle point problem in high-dimensional non-convex optimization, 2014, NIPS.
[6] Emmanuel Okewu, et al. Experimental Comparison of Stochastic Optimizers in Deep Learning, 2019, ICCSA.
[7] Kenneth Heafield, et al. Combining Global Sparse Gradients with Local Gradients in Distributed Neural Network Training, 2019, EMNLP/IJCNLP.
[8] Donghwan Kim, et al. Optimized first-order methods for smooth convex minimization, 2014, Mathematical Programming.
[9] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[10] Ning Qian. On the momentum term in gradient descent learning algorithms, 1999, Neural Networks.
[11] Kenji Kawaguchi. Deep Learning without Poor Local Minima, 2016, NIPS.
[12] Hiroaki Hayashi, et al. Improving Stochastic Gradient Descent with Feedback, 2016, arXiv.
[13] Yann LeCun, et al. Deep learning with Elastic Averaging SGD, 2014, NIPS.
[14] Boris Polyak, et al. Acceleration of stochastic approximation by averaging, 1992, SIAM Journal on Control and Optimization.
[15] John Moody, et al. Learning rate schedules for faster stochastic gradient search, 1992, Neural Networks for Signal Processing II: Proceedings of the 1992 IEEE Workshop.
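The title refers to adaptive moment estimation, i.e. the Adam optimizer introduced in reference [9]. As a minimal sketch of that update rule (the hyperparameter values are the defaults suggested in [9]; the scalar quadratic objective is purely an illustrative assumption, not from the source):

```python
import math

def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update (Kingma & Ba, reference [9]) for a single scalar parameter."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment (mean) estimate of the gradient
    v = beta2 * v + (1 - beta2) * grad * grad   # second-moment (uncentered variance) estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction for m (t is 1-indexed)
    v_hat = v / (1 - beta2 ** t)                # bias correction for v
    theta -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v

# Illustrative use: minimize f(theta) = theta^2, whose gradient is 2*theta.
theta, m, v = 5.0, 0.0, 0.0
for t in range(1, 5001):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t, lr=0.05)
# theta is now driven close to the minimizer at 0
```

The bias-correction terms matter early in training: because `m` and `v` start at zero, the raw moving averages are biased toward zero, and dividing by `1 - beta^t` compensates for this, as derived in [9].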