Rethinking the Value of Asynchronous Solvers for Distributed Deep Learning
[1] Forrest N. Iandola, et al. FireCaffe: Near-Linear Acceleration of Deep Neural Network Training on Compute Clusters, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[2] Yang You, et al. Scaling SGD Batch Size to 32K for ImageNet Training, 2017, ArXiv.
[3] Razvan Pascanu, et al. Distilling Policy Distillation, 2019, AISTATS.
[4] John Langford, et al. Slow Learners are Fast, 2009, NIPS.
[5] Yann LeCun, et al. Deep learning with Elastic Averaging SGD, 2014, NIPS.
[6] James Demmel, et al. ImageNet Training in Minutes, 2017, ICPP.
[7] Jorge Nocedal, et al. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima, 2016, ICLR.
[8] Yang You, et al. Large Batch Training of Convolutional Networks, 2017, arXiv:1708.03888.
[9] Masafumi Yamazaki, et al. Yet Another Accelerated SGD: ResNet-50 Training on ImageNet in 74.7 seconds, 2019, ArXiv.
[10] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.
[11] Stephen J. Wright, et al. Hogwild: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent, 2011, NIPS.
[12] Shane Legg, et al. Massively Parallel Methods for Deep Reinforcement Learning, 2015, ArXiv.
[13] Li Fei-Fei, et al. ImageNet: A large-scale hierarchical image database, 2009, CVPR.
[14] Ilya Sutskever, et al. Language Models are Unsupervised Multitask Learners, 2019.
[15] Marc'Aurelio Ranzato, et al. Large Scale Distributed Deep Networks, 2012, NIPS.
[16] Kaiming He, et al. Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour, 2017, ArXiv.
[17] Chong Wang, et al. Deep Speech 2: End-to-End Speech Recognition in English and Mandarin, 2015, ICML.
[18] Demis Hassabis, et al. Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm, 2017, ArXiv.
[19] James Demmel, et al. Scaling Deep Learning on GPU and Knights Landing clusters, 2017, SC17: International Conference for High Performance Computing, Networking, Storage and Analysis.
[20] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[21] Kilian Q. Weinberger, et al. Densely Connected Convolutional Networks, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[22] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[23] Dumitru Erhan, et al. Going deeper with convolutions, 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[24] Alex Krizhevsky, et al. Learning Multiple Layers of Features from Tiny Images, 2009.
[25] Takuya Akiba, et al. Extremely Large Minibatch SGD: Training ResNet-50 on ImageNet in 15 Minutes, 2017, ArXiv.
[26] Pongsakorn U.-Chupala, et al. ImageNet/ResNet-50 Training in 224 Seconds, 2018, ArXiv.