Scheduling Tasks Sharing Files from Distributed Repositories

This paper addresses the scheduling of a large collection of independent tasks onto a distributed heterogeneous platform composed of a set of servers. Each server is a processor cluster equipped with a file repository. The tasks depend on input files that initially reside in the server repositories, and a given file may be shared by several tasks. For each task, the problem is to decide which server will execute it and to transfer the required files to that server's repository. The objective is to find a task allocation, and to schedule the induced communications, so as to minimize the total execution time. The contribution of this paper is twofold. On the theoretical side, we establish a complexity result that assesses the difficulty of the problem. On the practical side, we design several new heuristics, including an extension of the min-min heuristic to this decentralized framework and several lower-cost heuristics, which we compare through extensive simulations.
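
To make the problem statement concrete, the following is a minimal sketch of a generic min-min style heuristic under a deliberately simplified model. It is not the extension developed in the paper. The function name `min_min`, the data layout, and the single uniform transfer cost `cost_per_unit` are all assumptions made for illustration; in particular, the sketch charges file transfers to the receiving server only and ignores how communications between servers contend with each other.

```python
def min_min(tasks, servers, file_size, cost_per_unit):
    """Greedy min-min sketch (hypothetical simplified model).

    tasks:   {task: (work, set_of_input_files)}
    servers: {server: (cycle_time, set_of_files_initially_held)}
    file_size: {file: size}
    cost_per_unit: assumed uniform time to transfer one unit of file data
    """
    ready = {s: 0.0 for s in servers}  # time at which each server becomes free
    held = {s: set(files) for s, (_, files) in servers.items()}
    pending = set(tasks)
    schedule = []

    while pending:
        best = None
        # Over all (task, server) pairs, find the smallest completion time.
        for t in sorted(pending):
            work, files = tasks[t]
            for s in sorted(servers):
                missing = files - held[s]  # files that must be fetched
                transfer = cost_per_unit * sum(file_size[f] for f in missing)
                finish = ready[s] + transfer + work * servers[s][0]
                if best is None or finish < best[0]:
                    best = (finish, t, s)
        finish, t, s = best
        ready[s] = finish
        held[s] |= tasks[t][1]  # fetched files stay in the repository
        pending.remove(t)
        schedule.append((t, s, finish))

    return schedule, max(ready.values())


if __name__ == "__main__":
    tasks = {"t1": (2, {"a"}), "t2": (1, {"a", "b"}), "t3": (3, {"c"})}
    servers = {"s1": (1.0, {"a", "b"}), "s2": (2.0, {"c"})}
    file_size = {"a": 1, "b": 1, "c": 4}
    print(min_min(tasks, servers, file_size, cost_per_unit=1.0))
```

Because a fetched file remains in the destination repository, later tasks sharing that file pay no transfer cost on that server, which is the file-sharing effect the problem statement describes.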
