On a Local Protocol for Concurrent File Transfers

We study a natural local protocol for a file transfer problem. Consider a scenario in which several files, which may vary in size and are created over a period of time, are to be transferred between pairs of hosts in a distributed environment. Under our protocol, a host uses no global knowledge while executing file transfers; it simply divides its I/O resources equally among all file transfers active at that host at any point in time. The protocol is motivated by its simplicity of use and by its applications to scheduling map-reduce workloads. Here we study the problem of deciding the start times of individual file transfers so as to optimize QoS metrics such as average completion time or MakeSpan. We first show that these problems are NP-hard. We next argue that the ability to schedule multiple concurrent file transfers at a host makes our protocol stronger than previously studied protocols that schedule a sequence of matchings, in which no two active file transfers share a host at any time. We then generalize the approach of Queyranne and Sviridenko (J. Algorithms 45:202–212, 2002) and Gandhi et al. (ACM Trans. Algorithms 4(1), 2008), which relates the MakeSpan and completion time objectives, and present constant-factor approximation algorithms.
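To make the equal-sharing rule concrete, the following is a minimal simulation sketch, not the paper's algorithm. It rests on assumptions that the abstract does not state: every host has unit I/O capacity, and a transfer between two hosts proceeds at the smaller of the two shares its endpoints give it. The function name `simulate` and the tuple layout `(src, dst, size, start)` are hypothetical, chosen only for illustration.

```python
from collections import Counter
from fractions import Fraction


def simulate(transfers):
    """Return the completion time of each transfer under equal sharing.

    transfers: list of (src, dst, size, start) tuples.
    Assumptions (not from the source): unit I/O capacity per host, and a
    transfer's rate is the minimum of the shares at its two endpoints.
    """
    remaining = [Fraction(size) for _, _, size, _ in transfers]
    starts = [Fraction(start) for _, _, _, start in transfers]
    done = [None] * len(transfers)
    pending = sorted(range(len(transfers)), key=lambda i: starts[i])
    active = set()
    now = Fraction(0)
    p = 0  # index of the next transfer in `pending` that has not started

    while p < len(pending) or active:
        # Activate every transfer whose start time has arrived.
        while p < len(pending) and starts[pending[p]] <= now:
            active.add(pending[p])
            p += 1
        if not active:
            now = starts[pending[p]]  # idle until the next start time
            continue

        # Count the active transfers at each host.
        load = Counter()
        for i in active:
            load[transfers[i][0]] += 1
            load[transfers[i][1]] += 1

        # Each host gives 1/load to each of its active transfers.
        rate = {
            i: Fraction(1, max(load[transfers[i][0]], load[transfers[i][1]]))
            for i in active
        }

        # Advance to the next event: a completion or a new start.
        dt = min(remaining[i] / rate[i] for i in active)
        if p < len(pending):
            dt = min(dt, starts[pending[p]] - now)
        for i in active:
            remaining[i] -= rate[i] * dt
        now += dt

        for i in [i for i in active if remaining[i] <= 0]:
            done[i] = now
            active.remove(i)
    return done


# Two transfers sharing host A, both started at time 0.
together = simulate([("A", "B", 2, 0), ("A", "C", 1, 0)])
# The same transfers with the larger one delayed by one time unit.
staggered = simulate([("A", "B", 2, 1), ("A", "C", 1, 0)])
print(together, staggered)
```

In this toy instance both schedules finish at time 3, but delaying the larger transfer lowers the average completion time from 5/2 to 2, which illustrates why the choice of start times is the optimization variable.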

[1] Joseph Hall, et al. On algorithms for efficient data migration, 2001, SODA '01.

[2] Han Hoogeveen, et al. Non-Approximability Results for Scheduling Problems with Minsum Criteria, 1998, INFORMS J. Comput.

[3] Mark K. Goldberg, et al. Edge-coloring of multigraphs: Recoloring technique, 1984, J. Graph Theory.

[4] Rajiv Gandhi, et al. Improved bounds for scheduling conflicting jobs with minsum criteria, 2008, TALG.

[5] Guy Kortsarz, et al. Min Sum Edge Coloring in Multigraphs Via Configuration LP, 2008, IPCO.

[6] Ronald L. Graham, et al. Bounds for certain multiprocessing anomalies, 1966.

[7] Mihir Bellare, et al. On Chromatic Sums and Distributed Resource Allocation, 1998, Inf. Comput.

[8] Sanjay Ghemawat, et al. MapReduce: Simplified Data Processing on Large Clusters, 2004, OSDI.

[9] Dániel Marx, et al. Minimum sum multicoloring on the edges of trees, 2006, Theor. Comput. Sci.

[10] Yoo-Ah Kim, et al. Data migration to minimize the total completion time, 2005, J. Algorithms.

[11] Takao Nishizeki, et al. A Better than "Best Possible" Algorithm to Edge Color Multigraphs, 1986, J. Algorithms.

[12] Maxim Sviridenko, et al. A (2+epsilon)-approximation algorithm for the generalized preemptive open shop problem with minsum objective, 2002, J. Algorithms.

[13] David S. Johnson, et al. Computers and Intractability: A Guide to the Theory of NP-Completeness, 1978.

[14] Edward G. Coffman, et al. Scheduling File Transfers, 1985, SIAM J. Comput.

[15] Dániel Marx, et al. Minimum Sum Multicoloring on the Edges of Planar Graphs and Partial k-Trees, 2004, WAOA.

[16] Maxim Sviridenko, et al. Approximation algorithms for shop scheduling problems with minsum objective, 2002.

[17] Samir Khuller, et al. Algorithms for data migration with cloning, 2003, SIAM J. Comput.

[18] Peter Sanders, et al. An asymptotic approximation scheme for multigraph edge coloring, 2005, SODA '05.

[19] Cynthia A. Phillips, et al. Improved Scheduling Algorithms for Minsum Criteria, 1996, ICALP.

[20] Maxim Sviridenko, et al. A (2+epsilon)-Approximation Algorithm for Generalized Preemptive Open Shop Problem with Minsum Objective, 2001, IPCO.

[21] Rajiv Gandhi, et al. Improved results for data migration and open shop scheduling, 2004, TALG.

[22] Yoo-Ah Kim, et al. Data migration to minimize the average completion time, 2003, SODA '03.

[23] Joseph Hall, et al. An Experimental Study of Data Migration Algorithms, 2001, WAE.

[24] Rajiv Gandhi, et al. Combinatorial Algorithms for Data Migration to Minimize Average Completion Time, 2006, Algorithmica.

[25] J. A. Bondy, et al. Graph Theory, 2008, Graduate Texts in Mathematics.

[26] Guy Kortsarz, et al. Tools for Multicoloring with Applications to Planar Graphs and Partial k-Trees, 2002, J. Algorithms.