Scheduling Coflows with Incomplete Information

In recent years, the coflow abstraction has received significant attention for its ability to capture application semantics. Building on it, multiple coflow scheduling mechanisms have been proposed to minimize the coflow completion time (CCT). Existing mechanisms fall mainly into two categories: information-omniscient and information-agnostic. However, data center applications present many cases in between, where incomplete coflow information is known, and such incomplete information can contribute substantially to improving CCT. To address such cases, we propose IICS, a coflow scheduling algorithm based on incomplete coflow information. IICS uses information about the parts of a coflow that have already arrived to estimate the coflow's remaining transmission time, and uses that estimate to approximate the Minimum Remaining Time First (MRTF) heuristic. In addition, IICS allocates bandwidth by monopolization and in a maximal manner, which achieves high bandwidth utilization. Extensive simulations under realistic settings show that IICS achieves an average CCT comparable to that of the information-omniscient algorithm and a 99th percentile CCT much smaller than those of both the information-omniscient and information-agnostic algorithms. Furthermore, IICS achieves noticeably higher throughput and is robust to the choice of algorithm parameters.
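To make the MRTF idea concrete, the following is a minimal sketch, not the IICS algorithm itself. The abstract does not specify how the remaining transmission time is deduced, so the estimator here is an assumption: it takes the bottleneck time over the flows that have already arrived, with a single uniform port bandwidth. The names `Coflow`, `estimated_remaining_time`, and `mrtf_order` are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Coflow:
    name: str
    # Remaining bytes of the flows that have ALREADY arrived, keyed by port.
    # Flows that have not arrived yet are unknown and therefore not counted
    # (assumption: this is the "incomplete information" available).
    remaining_bytes: Dict[str, float] = field(default_factory=dict)


def estimated_remaining_time(coflow: Coflow, port_bandwidth: float) -> float:
    """Hypothetical estimator: bottleneck time over the known flows."""
    if not coflow.remaining_bytes:
        return 0.0
    return max(b / port_bandwidth for b in coflow.remaining_bytes.values())


def mrtf_order(coflows: List[Coflow], port_bandwidth: float) -> List[Coflow]:
    """Order coflows by estimated remaining time, smallest first (MRTF)."""
    return sorted(
        coflows, key=lambda c: estimated_remaining_time(c, port_bandwidth)
    )


if __name__ == "__main__":
    coflows = [
        Coflow("a", {"p1": 100.0, "p2": 400.0}),
        Coflow("b", {"p1": 200.0}),
        Coflow("c", {"p3": 50.0}),
    ]
    for c in mrtf_order(coflows, port_bandwidth=100.0):
        print(c.name, estimated_remaining_time(c, 100.0))
```

A scheduler built this way would re-run the ordering whenever a new part of a coflow arrives, since the estimate changes as more information becomes known. Bandwidth allocation by monopolization is not modeled in this sketch.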
