Tuning ECN for data center networks

TCP performance in data center networks has raised serious concerns, including the long completion time of short TCP flows that compete with long TCP flows, and congestion caused by TCP incast. In this paper, we show that properly tuned Explicit Congestion Notification (ECN) based on the instantaneous queue length at the intermediate switches can alleviate both problems. Compared with previous work, our approach is appealing because it can be supported on current commodity switches with a simple parameter setting and requires no modification to the ECN protocol at the end servers. Furthermore, we observe a dilemma: a higher ECN threshold yields higher throughput for long flows, whereas a lower threshold supports more incast senders under buffer pressure. We address this problem with dequeue marking, a scheme that modifies only the switch and further tunes the instantaneous queue length based ECN so that a single threshold value achieves optimal incast performance and long-flow throughput. Our experimental study demonstrates that dequeue marking effectively raises the maximum number of incast senders close to the performance limit of ECN, achieving a gain of 16% to 140%.
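
The difference between marking at enqueue and marking at dequeue can be illustrated with a minimal sketch. It is not the paper's implementation: the function `first_marked_departure`, the slotted-time model (one packet leaves per slot), and the burst size are assumptions chosen only for illustration. Both variants compare the instantaneous queue length against a single threshold; they differ only in when the comparison is made.

```python
from collections import deque

def first_marked_departure(arrivals, threshold, mark_on_dequeue):
    """Hypothetical slotted model of one switch port.

    arrivals[t] is the number of packets arriving in slot t; one packet
    departs per slot. Returns the slot in which the first ECN-marked
    packet leaves the switch, or None if no packet is ever marked.
    """
    queue = deque()  # each entry is True if the packet carries a mark
    t = 0
    while t < len(arrivals) or queue:
        burst = arrivals[t] if t < len(arrivals) else 0
        for _ in range(burst):
            # Enqueue marking: decide on arrival, using the instantaneous
            # queue length including the arriving packet.
            marked = (not mark_on_dequeue) and len(queue) + 1 > threshold
            queue.append(marked)
        if queue:
            # Dequeue marking: decide at departure, using the instantaneous
            # queue length including the departing packet.
            length_at_departure = len(queue)
            marked = queue.popleft()
            if mark_on_dequeue and length_at_departure > threshold:
                marked = True
            if marked:
                return t
        t += 1
    return None

if __name__ == "__main__":
    burst = [10]  # assumed burst: 10 packets arrive together in slot 0
    print(first_marked_departure(burst, threshold=4, mark_on_dequeue=False))
    print(first_marked_departure(burst, threshold=4, mark_on_dequeue=True))
```

In this toy model the enqueue variant marks packets at the tail, so the first mark leaves the switch only after the unmarked packets ahead of it have drained, while the dequeue variant marks the packet at the head as soon as the queue exceeds the threshold. Earlier feedback of this kind is one plausible reason a single threshold can serve both incast and long-flow traffic, though the sketch does not model TCP's reaction to the marks.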
