Accelerating Distributed Deep Learning using Multi-Path RDMA in Data Center Networks
Wei Ye | Zhi-Li Zhang | Cheng Jin | Ziyan Wu | Feng Tian | Yang Zhang