ACC: automatic ECN tuning for high-speed datacenter networks

For widely deployed ECN-based congestion control schemes, the marking threshold is key to delivering high bandwidth and low latency. However, because of traffic dynamics in high-speed production networks, it is difficult to maintain consistent performance with a static ECN setting. To meet this operational challenge, this paper reports the design and implementation of ACC, an automatic run-time optimization scheme that uses multi-agent reinforcement learning to dynamically adjust the marking threshold at each switch. ACC works in a distributed fashion and combines offline and online training to adapt to dynamic traffic patterns. It can be easily deployed using common features supported by major commodity switching chips. Testbed experiments and large-scale simulations both show that ACC achieves low flow completion time (FCT) for both mice flows and elephant flows at line rate. In heterogeneous production environments with 300 machines, ACC achieves up to 20% higher IOPS and 30% lower FCT for a storage service than well-tuned static ECN settings. ACC has been applied in high-speed datacenter networks and significantly simplifies network operations.
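To make the role of the marking threshold concrete, the following is a minimal sketch of a RED-style ECN marking function of the kind commonly exposed by commodity switches. The abstract does not specify the marking function ACC tunes, so the three-parameter form (`kmin_kb`, `kmax_kb`, `pmax`) and all names and values here are illustrative assumptions, not the paper's implementation.

```python
def ecn_mark_probability(queue_len_kb, kmin_kb, kmax_kb, pmax):
    """Probability of ECN-marking a packet for a given queue length.

    Assumed RED-style scheme (hypothetical parameter names):
      - at or below kmin_kb: never mark
      - at or above kmax_kb: always mark
      - in between: probability rises linearly from 0 to pmax
    """
    if kmin_kb > kmax_kb:
        raise ValueError("kmin_kb must not exceed kmax_kb")
    if not 0.0 <= pmax <= 1.0:
        raise ValueError("pmax must be in [0, 1]")
    if queue_len_kb <= kmin_kb:
        return 0.0
    if queue_len_kb >= kmax_kb:
        return 1.0
    return pmax * (queue_len_kb - kmin_kb) / (kmax_kb - kmin_kb)


# Same queue length, two hypothetical static settings:
# a low threshold marks early (favoring latency), while a
# high threshold does not mark yet (favoring throughput).
low_threshold = ecn_mark_probability(250, kmin_kb=100, kmax_kb=400, pmax=0.2)
high_threshold = ecn_mark_probability(250, kmin_kb=400, kmax_kb=1600, pmax=0.2)
print(low_threshold, high_threshold)
```

A static setting fixes these parameters once; a run-time tuner such as the one the abstract describes would instead update them per switch as traffic changes.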
