PABO: A Link-Layer Congestion Mitigation Mechanism Based on Packet Bounce

In today's data centers, a diverse mix of throughput-sensitive long flows and delay-sensitive short flows commonly shares shallow-buffered switches. Long flows can block the transmission of delay-sensitive short flows, degrading their performance. Congestion can also be caused by the synchronization of multiple TCP connections carrying short flows, as typically seen in the partition/aggregate traffic pattern. While multiple end-to-end transport-layer solutions have been proposed, none of them tackles the real challenge: reliable transmission in the network. In this paper, we fill this gap by presenting PABO, a novel link-layer design that mitigates congestion by temporarily bouncing packets to upstream switches. PABO's design fulfills the following goals: i) providing per-flow flow control on the link layer, ii) handling transient congestion without the intervention of end devices, and iii) gradually back-propagating the congestion signal to the source when the network is unable to handle the congestion. Experimental results show that PABO offers a clear advantage in mitigating transient congestion and achieves a significant improvement in end-to-end delay.
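As a rough illustration of the bounce idea only, the following Python sketch models a path of switches, each with one bounded queue. A packet that arrives at a full queue is handed back to the upstream switch instead of being dropped, and if every queue back to the first hop is full, the caller is told that the congestion has reached the source. The `Switch` class, the `forward` function, the fixed capacities, and the "bounce only when full" rule are assumptions made for this sketch; the abstract does not specify PABO's actual bounce decision or its per-flow bookkeeping.

```python
from collections import deque


class Switch:
    """Toy switch with a single bounded FIFO queue (hypothetical model)."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.queue = deque()

    def is_full(self):
        return len(self.queue) >= self.capacity


def forward(path, hop, packet):
    """Deliver `packet` to path[hop], bouncing upstream while queues are full.

    Returns the index of the switch that finally buffered the packet,
    or -1 if the bounce travelled past the first hop, which this sketch
    treats as the congestion signal reaching the source.
    """
    while hop >= 0:
        if not path[hop].is_full():
            path[hop].queue.append(packet)
            return hop
        hop -= 1  # assumed rule: a full queue bounces the packet one hop upstream
    return -1


# Three switches in a line, each able to hold one packet.
path = [Switch("s0", 1), Switch("s1", 1), Switch("s2", 1)]
# Four packets all head for the last switch, s2.
results = [forward(path, 2, f"p{i}") for i in range(4)]
```

In this toy run the first packet is buffered at `s2`, the next two are absorbed by upstream switches, and only the fourth propagates the signal back to the sender. This mirrors goals ii) and iii) in spirit: transient overload is absorbed inside the network, and the source is involved only when the upstream buffers are exhausted.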
