iRDMA: Efficient Use of RDMA in Distributed Deep Learning Systems
Wei Zhang | Song Jiang | Li Zhang | Michel Hack | Xingbo Wu | Yandong Wang | Yufei Ren | Zijun Wang
[1] Clément Farabet, et al. Torch7: A Matlab-like Environment for Machine Learning, 2011, NIPS 2011.
[2] John Tran, et al. cuDNN: Efficient Primitives for Deep Learning, 2014, ArXiv.
[3] Brian Tierney, et al. Protocols for wide-area data-intensive applications: Design and performance issues, 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.
[4] Tao Wang, et al. Deep learning with COTS HPC systems, 2013, ICML.
[5] Ji Liu, et al. Staleness-Aware Async-SGD for Distributed Deep Learning, 2015, IJCAI.
[6] Dhabaleswar K. Panda, et al. High performance RDMA-based MPI implementation over InfiniBand, 2003, ICS.
[7] Stephen J. Wright, et al. Hogwild: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent, 2011, NIPS.
[8] P. Wyckoff, et al. iSER Storage Target for Object-Based Storage Devices, 2007, Fourth International Workshop on Storage Network Architecture and Parallel I/Os (SNAPI 2007).
[9] Shudong Jin, et al. Design and performance evaluation of NUMA-aware RDMA-based end-to-end data transfer systems, 2013, 2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[10] Seunghak Lee, et al. More Effective Distributed ML via a Stale Synchronous Parallel Parameter Server, 2013, NIPS.
[11] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[12] Trevor Darrell, et al. Caffe: Convolutional Architecture for Fast Feature Embedding, 2014, ACM Multimedia.
[13] Alexander J. Smola, et al. Scaling Distributed Machine Learning with the Parameter Server, 2014, OSDI.
[14] Alex Krizhevsky, et al. Learning Multiple Layers of Features from Tiny Images, 2009.
[15] Eric P. Xing, et al. GeePS: scalable deep learning on distributed GPUs with a GPU-specialized parameter server, 2016, EuroSys.
[16] Yann LeCun, et al. Deep learning with Elastic Averaging SGD, 2014, NIPS.
[17] Alexander J. Smola, et al. Parallelized Stochastic Gradient Descent, 2010, NIPS.
[18] Trishul M. Chilimbi, et al. Project Adam: Building an Efficient and Scalable Deep Learning Training System, 2014, OSDI.
[19] Marc'Aurelio Ranzato, et al. Large Scale Distributed Deep Networks, 2012, NIPS.
[20] Chong Wang, et al. Deep Speech 2: End-to-End Speech Recognition in English and Mandarin, 2015, ICML.