Massively Distributed SGD: ImageNet/ResNet-50 Training in a Flash
Hiroaki Mikami | Hisahiro Suganuma | Pongsakorn U-chupala | Yoshiki Tanaka | Yuichi Kageyama
[1] Li Fei-Fei, et al. ImageNet: A Large-Scale Hierarchical Image Database, 2009, CVPR.
[2] Sergey Ioffe, et al. Rethinking the Inception Architecture for Computer Vision, 2016, CVPR.
[3] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2016, CVPR.
[4] Kaiming He, et al. Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour, 2017, ArXiv.
[5] Yang You, et al. Large Batch Training of Convolutional Networks, 2017, ArXiv:1708.03888.
[6] Takuya Akiba, et al. Extremely Large Minibatch SGD: Training ResNet-50 on ImageNet in 15 Minutes, 2017, ArXiv.
[7] Elad Hoffer, et al. Train Longer, Generalize Better: Closing the Generalization Gap in Large Batch Training of Neural Networks, 2017, NIPS.
[8] Jorge Nocedal, et al. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima, 2016, ICLR.
[9] James Demmel, et al. ImageNet Training in 24 Minutes, 2017.
[10] Michael Garland, et al. AdaBatch: Adaptive Batch Sizes for Training Deep Neural Networks, 2017, ArXiv.
[11] Kurt Keutzer, et al. Large Batch Size Training of Neural Networks with Adversarial Training and Second-Order Information, 2018, ArXiv.
[12] Hao Wu, et al. Mixed Precision Training, 2017, ICLR.
[13] James Demmel, et al. ImageNet Training in Minutes, 2017, ICPP.
[14] Quoc V. Le, et al. Don't Decay the Learning Rate, Increase the Batch Size, 2017, ICLR.
[15] Yuanzhou Yang, et al. Highly Scalable Deep Learning Training System with Mixed-Precision: Training ImageNet in Four Minutes, 2018, ArXiv.
[16] Quoc V. Le, et al. A Bayesian Perspective on Generalization and Stochastic Gradient Descent, 2017, ICLR.
[17] Tao Wang, et al. Image Classification at Supercomputer Scale, 2018, ArXiv.