Improving TCP performance in data center networks with Adaptive Complementary Coding

In data center networks (DCNs), TCP suffers from low throughput and high latency because of its expensive timeout-based loss recovery mechanism. In this paper, we propose TCP with Adaptive Complementary Coding (TCP-ACC) to address these problems. Without revising existing TCP congestion control, we first design a lightweight complementary coding scheme that avoids TCP timeouts, which results in higher throughput and lower latency. In our scheme, the redundancy setting adapts to the real-time packet loss rate. We then use the Lyapunov optimization framework to find the optimal number of redundant coding packets for TCP-ACC, and we prove that TCP-ACC reduces the flow timeout probability to a level close to that of the optimal complementary coding solution. Extensive NS2 simulations show that, compared with three other solutions to TCP's problems in DCNs, TCP-ACC reduces the flow completion time by 45% and improves the flow throughput by 40% on average.
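To make the idea of loss-adaptive redundancy concrete, the following is a minimal sketch and not the TCP-ACC algorithm itself. It assumes an idealized erasure code in which any k of the k + r transmitted packets suffice to recover the k originals, and independent packet losses at rate p; it then picks the smallest r that keeps the probability of unrecoverable loss under a target. The function names, the `target` parameter, and the `r_max` cap are hypothetical, and the paper's Lyapunov-based optimization is not reproduced here.

```python
from math import comb


def decode_failure_prob(k: int, r: int, p: float) -> float:
    """Probability that more than r of the k + r packets are lost.

    Assumes independent losses at rate p and an idealized code where any
    k received packets recover the k originals (an assumption, not the
    paper's coding scheme).
    """
    n = k + r
    # Probability that at most r packets are lost, i.e. decoding succeeds.
    p_ok = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(r + 1))
    return 1.0 - p_ok


def min_redundancy(k: int, p: float, target: float, r_max: int = 64) -> int:
    """Smallest r (capped at the hypothetical r_max) with failure <= target."""
    for r in range(r_max + 1):
        if decode_failure_prob(k, r, p) <= target:
            return r
    return r_max


if __name__ == "__main__":
    # Redundancy grows as the measured loss rate grows.
    for loss_rate in (0.0, 0.01, 0.05, 0.1):
        print(loss_rate, min_redundancy(k=10, p=loss_rate, target=0.01))
```

Re-evaluating `min_redundancy` whenever the measured loss rate changes gives a simple form of adaptation: a lossless path needs no redundant packets, while a lossier path gets more.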
