FastLane: making short flows shorter with agile drop notification

The drive towards richer and more interactive web content places increasingly stringent requirements on datacenter network performance. Applications running atop these networks typically partition an incoming query into multiple subqueries and generate the final result by aggregating the responses to these subqueries. As a result, a large fraction of the network flows in such workloads (as high as 80%) are short and latency-sensitive. The speed with which existing networks respond to packet drops limits their ability to meet high-percentile flow completion time SLOs. Reliance on indirect notifications of packet drops (e.g., duplicates in an end-to-end acknowledgement sequence) is an important limit on how quickly sources can respond. This paper proposes FastLane, an in-network drop notification mechanism. FastLane enhances switches to send high-priority drop notifications to sources, thus informing sources as quickly as possible. Consequently, sources can retransmit packets sooner and throttle transmission rates earlier, thus reducing high-percentile flow completion times. We demonstrate, through simulation and implementation, that FastLane reduces 99.9th percentile completion times of short flows by up to 81%. These benefits come at minimal cost: safeguards ensure that FastLane consumes no more than 1% of bandwidth and 2.5% of buffers.
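
The idea of a switch notifying the source directly when it drops a packet, under a bandwidth safeguard, can be illustrated with a toy model. The sketch below is not the paper's implementation: the class names (`Packet`, `Notification`, `ToySwitch`), the 64-byte notification size, and the byte-counting budget rule are all assumptions made for illustration. Only the 1% bandwidth cap is taken from the abstract, and the 2.5% buffer safeguard is not modeled.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Packet:
    src: str
    dst: str
    seq: int
    size: int  # bytes


@dataclass
class Notification:
    to: str        # source host that should retransmit
    flow_dst: str  # destination of the dropped packet
    seq: int       # sequence number of the dropped packet


class ToySwitch:
    """Hypothetical switch model: tail-drop queue plus capped drop notifications."""

    def __init__(self, queue_capacity, notify_fraction=0.01, notification_size=64):
        self.queue = deque()
        self.queue_capacity = queue_capacity        # in packets (assumption)
        self.notify_fraction = notify_fraction      # 1% cap from the abstract
        self.notification_size = notification_size  # bytes (assumption)
        self.bytes_seen = 0       # all data bytes that arrived at the switch
        self.notify_bytes = 0     # bytes spent on notifications so far
        self.notifications = []   # stand-in for a high-priority output queue

    def receive(self, pkt):
        """Enqueue pkt; return False if it was dropped."""
        self.bytes_seen += pkt.size
        if len(self.queue) < self.queue_capacity:
            self.queue.append(pkt)
            return True
        # The packet is dropped. Notify the source only if doing so keeps
        # notification traffic within the configured fraction of bytes seen;
        # otherwise the source falls back to end-to-end loss detection.
        budget = self.notify_fraction * self.bytes_seen
        if self.notify_bytes + self.notification_size <= budget:
            self.notify_bytes += self.notification_size
            self.notifications.append(Notification(pkt.src, pkt.dst, pkt.seq))
        return False
```

In this model a source that receives a `Notification` could retransmit the named sequence number immediately instead of waiting for duplicate acknowledgements or a timeout. When the budget is exhausted the drop is silent, which corresponds to the behavior of a network without in-network notification.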
