Convergence rate of sign stochastic gradient descent for non-convex functions
Jeremy Bernstein | Yu-Xiang Wang | Kamyar Azizzadenesheli | Anima Anandkumar
[1] Michael I. Jordan,et al. Gradient Descent Can Take Exponential Time to Escape Saddle Points , 2017, NIPS.
[2] Oriol Vinyals,et al. Qualitatively characterizing neural network optimization problems , 2014, ICLR.
[3] Alex Krizhevsky,et al. Learning Multiple Layers of Features from Tiny Images , 2009 .
[4] Michael I. Jordan,et al. How to Escape Saddle Points Efficiently , 2017, ICML.
[5] Alex Graves,et al. DRAW: A Recurrent Neural Network For Image Generation , 2015, ICML.
[6] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[7] Cong Xu,et al. TernGrad: Ternary Gradients to Reduce Communication in Distributed Deep Learning , 2017, NIPS.
[8] Dong Yu,et al. 1-bit stochastic gradient descent and its application to data-parallel distributed training of speech DNNs , 2014, INTERSPEECH.
[9] Yoshua Bengio,et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.
[10] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[11] Nikko Strom,et al. Scalable distributed DNN training using commodity GPU cloud computing , 2015, INTERSPEECH.
[12] Zeyuan Allen-Zhu,et al. Natasha: Faster Non-Convex Stochastic Optimization via Strongly Non-Convex Parameter , 2017, ArXiv.
[13] Yurii Nesterov,et al. Cubic regularization of Newton method and its global performance , 2006, Math. Program..
[14] Martin A. Riedmiller,et al. A direct adaptive method for faster backpropagation learning: the RPROP algorithm , 1993, IEEE International Conference on Neural Networks.
[15] Alexander J. Smola,et al. DiFacto: Distributed Factorization Machines , 2016, WSDM.
[16] Surya Ganguli,et al. Identifying and attacking the saddle point problem in high-dimensional non-convex optimization , 2014, NIPS.
[17] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.