Accelerating Deep Learning Frameworks with Micro-Batches
暂无分享,去创建一个
Torsten Hoefler | Yosuke Oyama | Satoshi Matsuoka | Tal Ben-Nun | T. Hoefler | S. Matsuoka | Tal Ben-Nun | Yosuke Oyama
[1] Atsushi Ike,et al. Memory reduction method for deep neural network training , 2016, 2016 IEEE 26th International Workshop on Machine Learning for Signal Processing (MLSP).
[2] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[3] Trevor Darrell,et al. Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.
[4] Alex Krizhevsky,et al. Learning Multiple Layers of Features from Tiny Images , 2009 .
[5] Alex Krizhevsky,et al. One weird trick for parallelizing convolutional neural networks , 2014, ArXiv.
[6] Yann LeCun,et al. Fast Training of Convolutional Networks through FFTs , 2013, ICLR.
[7] Quoc V. Le,et al. Don't Decay the Learning Rate, Increase the Batch Size , 2017, ICLR.
[8] John Salvatier,et al. Theano: A Python framework for fast computation of mathematical expressions , 2016, ArXiv.
[9] Kaiming He,et al. Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour , 2017, ArXiv.
[10] Kilian Q. Weinberger,et al. Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[11] Michael S. Bernstein,et al. ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.
[12] Yang Yi,et al. Optimizing Memory Efficiency for Deep Convolutional Neural Networks on GPUs , 2016 .
[13] Kenta Oono,et al. Chainer : a Next-Generation Open Source Framework for Deep Learning , 2015 .
[14] Andrew Lavin,et al. Fast Algorithms for Convolutional Neural Networks , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[15] Eiji Oki,et al. GLPK (GNU Linear Programming Kit) , 2012 .
[16] Takuya Akiba,et al. Extremely Large Minibatch SGD: Training ResNet-50 on ImageNet in 15 Minutes , 2017, ArXiv.
[17] Martín Abadi,et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems , 2016, ArXiv.
[18] Natalia Gimelshein,et al. vDNN: Virtualized deep neural networks for scalable, memory-efficient neural network design , 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[19] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[20] Dumitru Erhan,et al. Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[21] H. Sebastian Seung,et al. ZNNi: Maximizing the Inference Throughput of 3D Convolutional Networks on CPUs and GPUs , 2016, SC16: International Conference for High Performance Computing, Networking, Storage and Analysis.