Scaling Deep Learning on GPU and Knights Landing clusters
James Demmel | Yang You | Aydin Buluç
[1] Stephen J. Wright,et al. Hogwild: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent , 2011, NIPS.
[2] Pritish Narayanan,et al. Deep Learning with Limited Numerical Precision , 2015, ICML.
[3] Le Song,et al. CA-SVM: Communication-Avoiding Support Vector Machines on Distributed Systems , 2015, 2015 IEEE International Parallel and Distributed Processing Symposium.
[4] Ran El-Yaniv,et al. Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations , 2016, J. Mach. Learn. Res.
[5] Tao Wang,et al. Deep learning with COTS HPC systems , 2013, ICML.
[6] Henk Corporaal,et al. Efficiency Optimization of Trainable Feature Extractors for a Consumer Platform , 2011, ACIVS.
[7] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[8] Fei-Fei Li,et al. ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[9] Yoshua Bengio,et al. Training deep neural networks with low precision multiplications , 2014 .
[10] Alexander J. Smola,et al. Scaling Distributed Machine Learning with the Parameter Server , 2014, OSDI.
[11] Dong Yu,et al. 1-bit stochastic gradient descent and its application to data-parallel distributed training of speech DNNs , 2014, INTERSPEECH.
[12] Alex Krizhevsky,et al. Learning Multiple Layers of Features from Tiny Images , 2009 .
[13] Jorge Nocedal,et al. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima , 2016, ICLR.
[14] Simon Haykin,et al. Gradient-Based Learning Applied to Document Recognition , 2001 .
[15] Samy Bengio,et al. Revisiting Distributed Synchronous SGD , 2016, ArXiv.
[16] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[17] Dumitru Erhan,et al. Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[18] Yi Yang,et al. Optimizing Memory Efficiency for Deep Convolutional Neural Networks on GPUs , 2016, SC16: International Conference for High Performance Computing, Networking, Storage and Analysis.
[19] Marc'Aurelio Ranzato,et al. Large Scale Distributed Deep Networks , 2012, NIPS.
[20] Geoffrey E. Hinton,et al. Deep Learning , 2015, Nature.
[21] Alexander J. Smola,et al. Efficient mini-batch training for stochastic optimization , 2014, KDD.
[22] Chong Wang,et al. Deep Speech 2: End-to-End Speech Recognition in English and Mandarin , 2015, ICML.
[23] Yann LeCun,et al. Deep learning with Elastic Averaging SGD , 2014, NIPS.
[24] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[25] Marc'Aurelio Ranzato,et al. Building high-level features using large scale unsupervised learning , 2011, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[26] Forrest N. Iandola,et al. FireCaffe: Near-Linear Acceleration of Deep Neural Network Training on Compute Clusters , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[27] Geoffrey E. Hinton,et al. On the importance of initialization and momentum in deep learning , 2013, ICML.
[28] James Demmel,et al. Asynchronous Parallel Greedy Coordinate Descent , 2016, NIPS.