[1] Joelle Pineau, et al. Conditional Computation in Neural Networks for faster models, 2015, arXiv.
[2] Marc'Aurelio Ranzato, et al. Large Scale Distributed Deep Networks, 2012, NIPS.
[3] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.
[4] William Chan, et al. Distributed asynchronous optimization of convolutional neural networks, 2014, INTERSPEECH.
[5] Alex Graves, et al. Generating Sequences With Recurrent Neural Networks, 2013, arXiv.
[6] Ji Liu, et al. Staleness-Aware Async-SGD for Distributed Deep Learning, 2015, IJCAI.
[7] Yijun Huang, et al. Asynchronous Parallel Stochastic Gradient for Nonconvex Optimization, 2015, NIPS.
[8] Andrew Lavin, et al. Fast Algorithms for Convolutional Neural Networks, 2016, CVPR.
[9] Dong Yu, et al. 1-bit stochastic gradient descent and its application to data-parallel distributed training of speech DNNs, 2014, INTERSPEECH.
[10] Suyog Gupta, et al. Model Accuracy and Runtime Tradeoff in Distributed Deep Learning: A Systematic Study, 2016, ICDM.
[11] Trishul M. Chilimbi, et al. Project Adam: Building an Efficient and Scalable Deep Learning Training System, 2014, OSDI.
[12] Shengen Yan, et al. Deep Image: Scaling up Image Recognition, 2015, arXiv.