DeTail: reducing the flow completion time tail in datacenter networks

Web applications have become so sophisticated that rendering a typical page may require hundreds of intra-datacenter flows. At the same time, web sites must meet strict page creation deadlines of 200-300ms to satisfy user demands for interactivity. Long-tailed flow completion times make it challenging for web sites to meet these constraints. Sites are forced to choose between rendering only a subset of a complex page, which sacrifices quality, and delaying its rendering, which misses the deadline and sacrifices responsiveness. Either option leads to potential financial loss. In this paper, we present a new cross-layer network stack aimed at reducing the long tail of flow completion times. The approach exploits cross-layer information to reduce packet drops, prioritize latency-sensitive flows, and evenly distribute network load. We evaluate our approach through NS-3-based simulation and a Click-based implementation, demonstrating that it consistently reduces the tail across a wide range of workloads. We often achieve reductions of over 50% in 99.9th percentile flow completion times.
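To illustrate why the tail matters when a page depends on hundreds of flows, the following minimal sketch computes the chance that at least one flow of a page lands beyond a given percentile of the flow completion time distribution. It assumes flows complete independently, which is a simplification not made in the abstract, and the function name `prob_page_hits_tail` is hypothetical.

```python
def prob_page_hits_tail(num_flows: int, percentile: float) -> float:
    """Chance that at least one of num_flows flows is slower than the
    given percentile of the flow completion time distribution.

    Assumption (illustrative only): flows complete independently.
    """
    if num_flows < 0:
        raise ValueError("num_flows must be non-negative")
    if not 0.0 <= percentile <= 100.0:
        raise ValueError("percentile must be in [0, 100]")
    # Probability that a single flow finishes at or below the percentile.
    p_within = percentile / 100.0
    # All flows must stay within the percentile for the page to avoid the tail.
    return 1.0 - p_within ** num_flows


if __name__ == "__main__":
    for flows in (1, 100, 300):
        share = prob_page_hits_tail(flows, 99.9)
        print(f"{flows} flows: {share:.1%} of pages see a 99.9th percentile flow")
```

Under this assumption, roughly 9.5% of pages built from 100 flows would include at least one flow from beyond the 99.9th percentile, which is why a reduction at that percentile affects far more than one page in a thousand.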
