moDNN: Memory Optimal Deep Neural Network Training on Graphics Processing Units
Xiaoming Chen | Xiaobo Sharon Hu | Yinhe Han | Danny Ziyi Chen
[1] Natalia Gimelshein,et al. vDNN: Virtualized deep neural networks for scalable, memory-efficient neural network design , 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[2] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[3] Mordecai Avriel,et al. Nonlinear programming , 1976 .
[4] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups , 2012, IEEE Signal Processing Magazine.
[5] Yann LeCun,et al. Fast Training of Convolutional Networks through FFTs , 2013, ICLR.
[6] Javier Romero,et al. Coupling Adaptive Batch Sizes with Learning Rates , 2016, UAI.
[7] Zheng Zhang,et al. MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems , 2015, ArXiv.
[8] Ran El-Yaniv,et al. Binarized Neural Networks , 2016, NIPS.
[9] Ronald L. Rivest,et al. Introduction to Algorithms , 1990 .
[10] H. T. Kung,et al. I/O complexity: The red-blue pebble game , 1981, STOC '81.
[11] Ali Farhadi,et al. XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks , 2016, ECCV.
[12] Dumitru Erhan,et al. Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[13] Quoc V. Le,et al. On optimization methods for deep learning , 2011, ICML.
[14] Marc'Aurelio Ranzato,et al. Large Scale Distributed Deep Networks , 2012, NIPS.
[15] Tianqi Chen,et al. Training Deep Nets with Sublinear Memory Cost , 2016, ArXiv.
[16] Pritish Narayanan,et al. Deep Learning with Limited Numerical Precision , 2015, ICML.
[17] Martín Abadi,et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems , 2016, ArXiv.
[18] Hassan Foroosh,et al. Sparse Convolutional Neural Networks , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[19] Ronald L. Rivest,et al. Introduction to Algorithms, 3rd Edition , 2009 .
[20] Xiaoming Chen,et al. moDNN: Memory optimal DNN training on GPUs , 2018, 2018 Design, Automation & Test in Europe Conference & Exhibition (DATE).
[21] Yoav Goldberg,et al. A Primer on Neural Network Models for Natural Language Processing , 2015, J. Artif. Intell. Res.
[22] Zenglin Xu,et al. Efficient Communications in Training Large Scale Neural Networks , 2017, ACM Multimedia.
[23] Clément Farabet,et al. Torch7: A Matlab-like Environment for Machine Learning , 2011, NIPS 2011.
[24] John Tran,et al. cuDNN: Efficient Primitives for Deep Learning , 2014, ArXiv.
[25] Jeffrey Scott Vitter,et al. External memory algorithms and data structures: dealing with massive data , 2001, CSUR.
[26] Song Han,et al. Learning both Weights and Connections for Efficient Neural Network , 2015, NIPS.
[27] Mohak Shah,et al. Comparative Study of Deep Learning Software Frameworks , 2015, ArXiv 1511.06435.
[28] Razvan Pascanu,et al. Theano: A CPU and GPU Math Compiler in Python , 2010, SciPy.
[29] Sachin S. Talathi,et al. Fixed Point Quantization of Deep Convolutional Networks , 2015, ICML.
[30] Trevor Darrell,et al. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[31] Carter Bays,et al. A comparison of next-fit, first-fit, and best-fit , 1977, CACM.
[32] Thomas Brox,et al. U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.
[33] Andrew Lavin,et al. Fast Algorithms for Convolutional Neural Networks , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[34] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[35] Quanquan C. Liu. Red-blue and standard pebble games : complexity and applications in the sequential and parallel models , 2017 .
[36] Trevor Darrell,et al. Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.
[37] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[38] Alex Graves,et al. Memory-Efficient Backpropagation Through Time , 2016, NIPS.
[39] C. Charalambous,et al. Conjugate gradient algorithm for efficient training of artificial neural networks , 1990 .
[40] Peter B. Galvin,et al. Operating System Concepts, 4th Ed. , 1993 .