Data-bandwidth-aware Job Scheduling in Grid and Cluster Environments

This paper introduces techniques for scheduling jobs on a master/workers platform where the bandwidth is shared by all workers. The goal is to minimize the makespan. The jobs are independent, and each job requires a fixed amount of bandwidth to download its input data before execution. The master can communicate with multiple workers simultaneously, provided that the bandwidth used by the master and by each worker does not exceed its bandwidth limit. We propose two models for this limited-bandwidth problem, distinguished by whether a data transfer can be interrupted. If the data transfer cannot be interrupted, we prove that the scheduling problem is NP-complete; nevertheless, we propose heuristic algorithms and test their performance experimentally. If the data transfer can be interrupted, we propose an algorithm that produces a schedule with optimal makespan. The algorithm combines a binary search on the completion time with an efficient feasibility check for a given completion time.
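The binary-search structure described above can be illustrated with a minimal sketch. The abstract does not specify the feasibility verification procedure, so the check below is a deliberately simplified stand-in and not the paper's method. It assumes a toy model in which every job runs on its own worker, only the master's bandwidth is limiting, and interruptible transfers are served in earliest-deadline-first order. The names `min_feasible_time`, `toy_feasible`, `jobs`, and `bandwidth` are hypothetical.

```python
from typing import Callable, List, Tuple


def min_feasible_time(feasible: Callable[[float], bool],
                      lo: float, hi: float, tol: float = 1e-6) -> float:
    """Binary search for the smallest completion time accepted by `feasible`.

    Assumes `feasible` is monotone: if time t is feasible, so is any t' > t.
    """
    if not feasible(hi):
        raise ValueError("upper bound must be feasible")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if feasible(mid):
            hi = mid   # mid works, so the optimum is at or below mid
        else:
            lo = mid   # mid fails, so the optimum is above mid
    return hi


def toy_feasible(jobs: List[Tuple[float, float]],
                 bandwidth: float, t: float) -> bool:
    """Toy feasibility check (an assumption, not the paper's procedure).

    Each job is (data_size, compute_time). A job finishes by time t only if
    its download finishes by the deadline t - compute_time. With interruptible
    transfers sharing one master link, serving jobs in deadline order is
    enough: every prefix of jobs must fit within bandwidth * deadline.
    """
    by_deadline = sorted((t - compute, data) for data, compute in jobs)
    sent = 0.0
    for deadline, data in by_deadline:
        sent += data
        if deadline < 0 or sent > bandwidth * deadline:
            return False
    return True


# Hypothetical instance: two jobs given as (data size, compute time).
jobs = [(10.0, 2.0), (10.0, 5.0)]
bandwidth = 5.0
# Sending all data first and then running the longest job is always feasible.
upper = sum(d for d, _ in jobs) / bandwidth + max(c for _, c in jobs)
best = min_feasible_time(lambda t: toy_feasible(jobs, bandwidth, t), 0.0, upper)
print(round(best, 3))
```

The search relies only on monotonicity of the feasibility predicate, so a more realistic check (for example, one that also enforces per-worker bandwidth limits) could be substituted without changing `min_feasible_time`.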
