On mitigating TCP Incast in Data Center Networks

TCP Incast, also known as TCP throughput collapse, describes the under-utilization of link capacity that arises in certain many-to-one communication patterns, which are common in datacenter applications. Prior work attributes TCP Incast mainly to packet drops at the congested switch that result in TCP timeouts. Congestion control algorithms have been developed to reduce or eliminate packet drops at the congested switch. In this paper, we investigate the performance of Quantized Congestion Notification (QCN) with respect to the TCP Incast problem during data access from clustered servers in datacenters. QCN can control link rates rapidly and effectively in a datacenter environment. However, it performs poorly under TCP Incast. To explain this low link utilization, we examine the rate fluctuation of different flows within one synchronous read request and find that the poor TCP throughput with QCN is due to rate unfairness among the flows. We therefore propose an enhanced QCN congestion control algorithm, called fair Quantized Congestion Notification (FQCN), to improve the fairness of multiple flows sharing one bottleneck link. We evaluate FQCN against QCN in terms of fairness and convergence with four simultaneous and eight staggered source flows. Compared with QCN, fairness improves greatly and the queue length at the bottleneck link converges to the equilibrium queue length very quickly. We also investigate the effect of FQCN on TCP throughput collapse. Simulation results show that FQCN significantly enhances TCP throughput performance in a TCP Incast setup.
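
The abstract attributes the poor throughput to rate unfairness among flows sharing one bottleneck link but does not say how fairness is measured. As a minimal sketch, assuming Jain's fairness index as the metric (a common choice, not one confirmed by the abstract), unfairness among per-flow rates could be quantified as follows. The function name and the sample rates are hypothetical and chosen only for illustration.

```python
def jain_fairness_index(rates):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2).

    Returns 1.0 when all flows get the same rate and approaches 1/n
    when a single flow takes nearly all of the bandwidth.
    """
    if not rates:
        raise ValueError("rates must be non-empty")
    total = sum(rates)
    squares = sum(r * r for r in rates)
    if squares == 0:
        raise ValueError("at least one rate must be non-zero")
    return total * total / (len(rates) * squares)


# Hypothetical per-flow rates (Gbps) for four flows on one bottleneck link.
# These numbers are illustrative assumptions, not results from the paper.
equal_shares = [2.5, 2.5, 2.5, 2.5]
skewed_shares = [7.0, 1.5, 1.0, 0.5]

print(jain_fairness_index(equal_shares))
print(jain_fairness_index(skewed_shares))
```

Under this metric the equal split scores 1.0, while the skewed split scores well below it even though both sets of rates sum to the same aggregate, which is the kind of gap a fairness-oriented scheme such as FQCN aims to close.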
