Minimizing flow completion times in data centers

In provisioning large-scale online applications such as web search, social networks, and advertisement systems, data centers face the twin challenges of providing low latency for short flows (which result from end-user actions) and high throughput for background flows (which are needed to maintain data consistency and structure across massively distributed systems). We propose L2DCT, a practical data center transport protocol that reduces flow completion times for short flows by approximating the Least Attained Service (LAS) scheduling discipline, without requiring any changes to application software or router hardware, and without adversely affecting long flows. L2DCT can co-exist with TCP and works by adapting flow rates to the extent of network congestion, which it infers from Explicit Congestion Notification (ECN) marking, a feature widely supported by the installed router base. Although L2DCT is deadline-unaware, our results indicate that, for typical data center traffic patterns and deadlines and over a wide range of traffic loads, its deadline miss rate is consistently lower than that of existing deadline-driven data center transport protocols. L2DCT reduces the mean flow completion time by up to 50% relative to DCTCP and by up to 95% relative to TCP. In addition, it reduces the 99th-percentile flow completion time by 37% relative to DCTCP. We present the design and analysis of L2DCT, evaluate its performance, and discuss an implementation built upon the standard Linux protocol stack.
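To make the idea concrete, the following is a minimal sketch of how a sender might combine an ECN-derived congestion estimate with a weight based on attained service, so that flows that have sent little data back off less and grow faster. The abstract gives no formulas, so everything here is an assumption for illustration: the names `FlowState`, `weight`, and `on_window_acked`, the constants `G`, `W_MAX`, `W_MIN`, `SMALL`, and `LARGE`, and the exact update rules. It should not be read as the protocol's specification.

```python
from dataclasses import dataclass

# All constants below are illustrative assumptions, not values from the abstract.
G = 1.0 / 16        # gain of the moving average over the ECN-marked fraction
W_MAX = 2.5         # weight for flows that have sent little data
W_MIN = 0.125       # weight for flows that have sent a lot of data
SMALL = 200_000     # bytes sent at or below which a flow gets W_MAX
LARGE = 1_000_000   # bytes sent at or above which a flow gets W_MIN


@dataclass
class FlowState:
    cwnd: float = 10.0   # congestion window, in packets
    alpha: float = 0.0   # smoothed fraction of ECN-marked packets
    bytes_sent: int = 0  # attained service so far


def weight(bytes_sent: int) -> float:
    """Map attained service to a weight: less data sent gives a larger weight."""
    if bytes_sent <= SMALL:
        return W_MAX
    if bytes_sent >= LARGE:
        return W_MIN
    frac = (bytes_sent - SMALL) / (LARGE - SMALL)
    return W_MAX - frac * (W_MAX - W_MIN)


def on_window_acked(flow: FlowState, acked_pkts: int, marked_pkts: int,
                    bytes_acked: int) -> float:
    """Update the window once per window of ACKs (hypothetical hook)."""
    flow.bytes_sent += bytes_acked
    marked_frac = marked_pkts / acked_pkts if acked_pkts else 0.0
    flow.alpha = (1 - G) * flow.alpha + G * marked_frac
    w = weight(flow.bytes_sent)
    if marked_pkts > 0:
        # alpha is in [0, 1], so a larger weight gives a smaller penalty.
        penalty = flow.alpha ** w
        flow.cwnd = max(1.0, flow.cwnd * (1 - penalty / 2))
    else:
        # Without congestion, flows with a larger weight grow faster.
        flow.cwnd += w / W_MAX
    return flow.cwnd


short_flow = FlowState(bytes_sent=0)
long_flow = FlowState(bytes_sent=5_000_000)
for flow in (short_flow, long_flow):
    on_window_acked(flow, acked_pkts=10, marked_pkts=5, bytes_acked=15_000)
print(short_flow.cwnd > long_flow.cwnd)
```

Under the same marking rate, the flow with less attained service keeps a larger window, which is the LAS-like bias described above. The sketch uses only end-host state and ECN feedback, in line with the claim that no router or application changes are needed.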
