MaxPass: Credit-based multipath transmission for load balancing in data centers

Many applications depend on data center networks to carry their traffic efficiently. Data center networks usually have a hierarchical topology and exhibit distinct traffic patterns, both of which differ from those of the traditional Internet. These features drive data center networks to reduce flow completion time (FCT) and to achieve high throughput. One possible solution is to balance network load across multiple paths with mechanisms such as equal-cost multipath (ECMP) routing. ECMP lets flows exploit multiple paths by hashing flow metadata. However, because hash functions are effectively random, ECMP often distributes traffic unevenly, which makes it hard to use the full capacity of the links. We therefore propose MaxPass, an adaptive load balancing mechanism for multiple paths in data centers that complements ECMP. A sender adaptively selects its paths and dynamically changes them according to the current network status, such as congestion. To monitor the network status, the corresponding receiver periodically transmits a probe packet to the sender; the loss of a probe indicates congestion. We carry out a quantitative analysis with the ns-2 simulator to show that MaxPass can improve both FCT and throughput.
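The uneven distribution attributed to ECMP above can be illustrated with a minimal sketch. This is not the MaxPass mechanism or any switch's actual hash; the 5-tuple layout, the use of SHA-256, the flow sizes, and the helper names `ecmp_path` and `path_loads` are all assumptions made for illustration. It hashes each flow onto one of four equal-cost paths and sums the bytes placed on each path.

```python
import hashlib

def ecmp_path(five_tuple, num_paths):
    """Map a flow to a path index by hashing its 5-tuple (assumed layout)."""
    key = "|".join(str(field) for field in five_tuple).encode()
    digest = hashlib.sha256(key).digest()
    # Use the first 4 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:4], "big") % num_paths

def path_loads(flows, num_paths):
    """Return the total bytes that hashing places on each path."""
    loads = [0] * num_paths
    for five_tuple, size in flows:
        loads[ecmp_path(five_tuple, num_paths)] += size
    return loads

# Synthetic workload (an assumption): every 8th flow is a 10 MB "elephant",
# the rest are 10 KB "mice".
flows = [
    (("10.0.0.1", "10.0.1.1", 40000 + i, 80, "tcp"),
     10_000_000 if i % 8 == 0 else 10_000)
    for i in range(32)
]

loads = path_loads(flows, 4)
print(loads)  # per-path byte counts; generally far from an even split
```

Because every packet of a flow follows the same hashed path, a few large flows landing on one path can dominate its load regardless of how many small flows are spread over the others.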
