AG: Adaptive Switching Granularity for Load Balancing with Asymmetric Topology in Data Center Network

Modern data center topologies often take the form of a multi-rooted tree with rich parallel paths to provide high bandwidth. However, path diversity caused by traffic dynamics, link failures, and heterogeneous switching equipment is widespread in production data center networks. Multi-path load balancers in data centers should therefore be robust to this diversity. Although prior fine-grained schemes such as RPS and Presto make full use of available paths, they are prone to packet reordering under asymmetric topologies. Coarse-grained solutions such as ECMP and LetFlow effectively avoid packet reordering, but easily leave multiple paths under-utilized. To cope with these inefficiencies, we propose a load balancing mechanism called AG, which adaptively adjusts the switching granularity according to the degree of asymmetry among multiple paths. AG increases the switching granularity to alleviate packet reordering when topology asymmetry is large, and reduces it to obtain high link utilization when topology asymmetry is small. AG is deployed on switches with negligible overhead and requires no modification to end hosts. We evaluate AG through both a Mininet testbed and large-scale NS2 simulations. The experimental results show that AG reduces the average and 99th-percentile flow completion time by up to 51% and 56%, respectively, compared with state-of-the-art load balancing schemes.
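The adaptive rule described above can be illustrated with a minimal sketch. The abstract does not specify how the degree of asymmetry is measured or how it maps to a granularity, so everything below is an assumption made for illustration: the function names, the delay-based asymmetry metric, the linear mapping, and the 1 to 64 packet range are all hypothetical and are not taken from AG itself.

```python
def asymmetry_degree(path_delays_us):
    """Hypothetical metric: relative spread of path delays, in [0, 1].

    0 means all paths look identical; values near 1 mean the slowest
    path is far slower than the fastest one.
    """
    slowest = max(path_delays_us)
    fastest = min(path_delays_us)
    if slowest <= 0:
        return 0.0
    return (slowest - fastest) / slowest


def switching_granularity(path_delays_us, min_pkts=1, max_pkts=64):
    """Number of consecutive packets to send before switching paths.

    Assumed linear mapping: symmetric paths give packet-level switching
    (min_pkts), highly asymmetric paths give coarse switching (max_pkts).
    """
    degree = asymmetry_degree(path_delays_us)
    return min_pkts + round(degree * (max_pkts - min_pkts))


if __name__ == "__main__":
    # Symmetric paths: switch on every packet for high utilization.
    print(switching_granularity([100, 100, 100]))
    # One path four times slower: switch rarely to limit reordering.
    print(switching_granularity([100, 400]))
```

Under these assumptions, the granularity grows monotonically with the delay spread, which mirrors the stated behavior: finer switching when paths are nearly symmetric, coarser switching when they diverge.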
