Creek: Inter Many-to-Many Coflows Scheduling for Datacenter Networks

Datacenter networked applications often require multiple data transfer flows that together form a semantic group called a coflow. A coflow is considered complete only when all of its transfers have completed. Application performance is therefore optimized by minimizing the completion time of the coflow as a whole, rather than that of its individual flows. Popular coflow scheduling algorithms are mostly centralized and incur high overheads. Decentralized approaches in the “many-to-many” scenario also incur high overheads because of the communication among local controllers. In this paper, we present a coflow scheduling mechanism that aims to minimize the coflow completion time for coflows with a many-to-many communication pattern; as a byproduct, communication overhead is also minimized. Our algorithm preserves compatibility with existing commodity switches and network protocols and improves coflow completion times by 1.8 times on average compared to the baseline, as demonstrated via testbed implementation and large-scale simulation.
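
The completion rule stated above (a coflow finishes when its last flow finishes) can be illustrated with a minimal sketch. This is not the scheduling algorithm of the paper; the function name, the coflow labels, and the finish times are hypothetical values chosen only to show how a coflow completion time differs from per-flow completion times.

```python
from typing import Dict, List


def coflow_completion_time(flow_finish_times: List[float]) -> float:
    """Return the time at which the slowest flow of a coflow finishes."""
    # A coflow is complete only when every one of its flows is complete,
    # so its completion time is the maximum of the flow finish times.
    return max(flow_finish_times)


# Hypothetical finish times (in seconds) for the flows of two coflows.
flow_finish: Dict[str, List[float]] = {
    "coflow_a": [2.0, 5.0, 3.0],
    "coflow_b": [1.0, 1.5],
}

# Per-coflow completion time, then the average across coflows.
ccts = {name: coflow_completion_time(times) for name, times in flow_finish.items()}
average_cct = sum(ccts.values()) / len(ccts)
```

In this sketch, speeding up the flows of coflow_a that finish at 2.0 or 3.0 seconds would not change its completion time; only the flow finishing at 5.0 seconds matters, which is why scheduling at the coflow level differs from scheduling individual flows.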
