Layer-Specific Adaptive Learning Rates for Deep Networks
Bharat Singh | Thomas Goldstein | Gavin Taylor | Soham De | Yangmuzi Zhang
[1] Ming Yang, et al. DeepFace: Closing the Gap to Human-Level Performance in Face Verification, 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[2] Boris Polyak. Some methods of speeding up the convergence of iteration methods, 1964.
[3] Matthew D. Zeiler. ADADELTA: An Adaptive Learning Rate Method, 2012, ArXiv.
[4] Tara N. Sainath, et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups, 2012, IEEE Signal Processing Magazine.
[5] Yoram Singer, et al. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization, 2011, J. Mach. Learn. Res.
[6] Geoffrey E. Hinton, et al. On the importance of initialization and momentum in deep learning, 2013, ICML.
[7] A. Bray, et al. Statistics of critical points of Gaussian fields on large-dimensional spaces, 2006, Physical Review Letters.
[8] Y. Nesterov. A method for solving the convex programming problem with convergence rate O(1/k^2), 1983.
[9] Surya Ganguli, et al. On the saddle point problem for non-convex optimization, 2014, ArXiv.
[10] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[11] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[12] Trevor Darrell, et al. Caffe: Convolutional Architecture for Fast Feature Embedding, 2014, ACM Multimedia.
[13] H. Robbins. A Stochastic Approximation Method, 1951.
[14] Christopher Potts, et al. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank, 2013, EMNLP.
[15] Yan V. Fyodorov, et al. Replica Symmetry Breaking Condition Exposed by Random Matrix Calculation of Landscape Complexity, 2007, cond-mat/0702601.