Minimizing datacenter flow completion times with server-based flow scheduling

Minimizing flow completion times (FCT) is a critical issue in datacenter networks. Existing approaches either cannot minimize FCT (e.g., DCTCP) or are costly to deploy because they require hardware modifications to switches (e.g., pFabric). This paper presents a server-based flow scheduling (SFS) scheme that can be deployed easily and rapidly on servers while retaining almost the same minimal FCT as the state-of-the-art pFabric. SFS combines traffic control and flow scheduling to keep in-switch queues very short; hence switches do not need to schedule flows, and that function can be moved to servers. With SFS, each server keeps its highest-priority flow active and pauses the other, lower-priority ones. Flows are then completed one by one to minimize FCT. We propose two novel techniques to achieve SFS. First, the bidirectional flow scheduling technique lets each server locally schedule the flows it sends or receives. Second, the most recently seen flow coordination technique coordinates senders and receivers to address the priority disagreement problem. We show that SFS scales to thousands of senders in the Incast scenario and achieves zero queueing across the entire network. Experimental results on NS2 show that SFS outperforms DCTCP and pFabric in the Incast scenario. On real workloads in a FatTree topology, SFS is 4× faster than DCTCP and closely approaches pFabric in the tail FCT of short flows.
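The one-flow-at-a-time idea can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say how priority is defined, so the sketch assumes that the flow with the fewest remaining bytes has the highest priority (as in shortest-remaining-first scheduling), and it models a single server on a single link with no new arrivals. The names `Flow`, `pick_active`, and `run_one_by_one` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Flow:
    name: str
    remaining: int  # bytes left; ASSUMPTION: fewer bytes means higher priority


def pick_active(flows):
    """Return the one flow the server keeps active; all others stay paused."""
    pending = [f for f in flows if f.remaining > 0]
    return min(pending, key=lambda f: f.remaining) if pending else None


def run_one_by_one(flows, link_rate=1):
    """Serve flows one at a time on a single link; map flow name to finish time."""
    now = 0.0
    completion = {}
    while (active := pick_active(flows)) is not None:
        # Only the active flow uses the link, so it finishes at full rate.
        now += active.remaining / link_rate
        active.remaining = 0
        completion[active.name] = now
    return completion


flows = [Flow("a", 300), Flow("b", 100), Flow("c", 200)]
fct = run_one_by_one(flows)
print(fct)
```

Under these assumptions the flows finish in the order b, c, a, so the short flows are not delayed behind the long one, which is the intuition behind completing flows one by one.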
