Optimal Schedules for Cycle-Stealing in a Network of Workstations with a Bag-of-Tasks Workload

We refine the model underlying our prior work on scheduling bag-of-tasks ("embarrassingly parallel") workloads via cycle-stealing in networks of workstations (S.N. Bhatt et al., 1997; A.L. Rosenberg, 1999), obtaining a model wherein the scheduling guidelines of Rosenberg produce optimal schedules for every such cycle-stealing opportunity. We thereby render prescriptive the descriptive model of those sources. Although computing optimal schedules usually requires the use of general function-optimizing methods, we show how to compute optimal schedules efficiently for the broad class of opportunities whose durations come from a concave probability distribution. Even when no such efficient computation of an optimal schedule is available, our refined model often suggests a natural notion of approximately optimal schedule, which may be efficiently computable. We illustrate such efficient approximability via the important class of cycle-stealing opportunities whose durations come from a heavy-tailed distribution. Such opportunities do not admit any optimal schedule, nor even a natural notion of approximately optimal schedule, within the model of Bhatt and Rosenberg. Within our refined model, though, we derive computationally simple schedules for heavy-tailed opportunities, which can be "tuned" to accomplish an expected amount of work that is arbitrarily close to optimal.
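To make the notion of a schedule's expected work concrete, here is a minimal sketch. It assumes the objective commonly associated with the cited earlier work, which this abstract does not state: an opportunity is split into periods of lengths t_0, t_1, ...; each period pays a fixed communication overhead c; and a period's work counts only if the workstation has not been reclaimed by the end of that period. The names `expected_work` and `uniform_survival`, the overhead value, and the linear survival function are illustrative assumptions, not taken from the paper.

```python
from typing import Callable, Sequence


def expected_work(periods: Sequence[float],
                  survival: Callable[[float], float],
                  overhead: float) -> float:
    """Expected work of a schedule under the ASSUMED objective
    sum_i max(t_i - c, 0) * p(T_i), where T_i = t_0 + ... + t_i and
    p(T) is the probability the opportunity lasts at least T time units."""
    elapsed = 0.0
    total = 0.0
    for t in periods:
        elapsed += t
        # Work of a period is lost entirely if the owner reclaims the
        # workstation before the period ends, hence the factor p(elapsed).
        total += max(t - overhead, 0.0) * survival(elapsed)
    return total


def uniform_survival(lifespan: float) -> Callable[[float], float]:
    """Hypothetical survival function: reclamation time uniform on [0, lifespan]."""
    return lambda t: max(0.0, 1.0 - t / lifespan)


# Compare two candidate schedules for a lifespan of 100 and overhead 1.
p = uniform_survival(100.0)
one_long_period = expected_work([50.0], p, 1.0)
four_short_periods = expected_work([25.0] * 4, p, 1.0)
```

Under these assumptions the single 50-unit period yields an expected 24.5 units of work, while four 25-unit periods yield 36.0, which illustrates why the choice of period lengths matters even for a simple risk profile.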

[1]  Teunis J. Ott, et al.  Load-balancing heuristics and process behavior, 1986, SIGMETRICS '86/PERFORMANCE '86.

[2]  Keith Marzullo, et al.  The computational Co-op: Gathering clusters into a metacomputer, 1999, Proceedings 13th International Parallel Processing Symposium and 10th Symposium on Parallel and Distributed Processing. IPPS/SPDP 1999.

[3]  Arnold L. Rosenberg, et al.  Sharing partitionable workloads in heterogeneous NOWs: greedier is not better, 2001, Proceedings 42nd IEEE Symposium on Foundations of Computer Science.

[4]  Arnold L. Rosenberg, et al.  On Optimal Strategies for Cycle-Stealing in Networks of Workstations, 1997, IEEE Trans. Computers.

[5]  Ramesh Subramonian, et al.  LogP: towards a realistic model of parallel computation, 1993, PPOPP '93.

[6]  Charles E. Leiserson, et al.  Space-Efficient Scheduling of Multithreaded Computations, 1998, SIAM J. Comput.

[7]  Amos Fiat, et al.  Making commitments in the face of uncertainty: how to pick a winner almost every time (extended abstract), 1996, STOC '96.

[8]  David E. Culler, et al.  A case for NOW (networks of workstations), 1995, PODC '95.

[9]  Arnold L. Rosenberg.  Guidelines for Data-Parallel Cycle-Stealing in Networks of Workstations II: On Maximizing Guaranteed Output, 2000, Int. J. Found. Comput. Sci.

[10]  Dan C. Marinescu, et al.  Models and Algorithms for Coscheduling Compute-Intensive Tasks on a Network of Workstations, 1992, J. Parallel Distributed Comput.

[11]  Dhabaleswar K. Panda, et al.  Efficient collective communication on heterogeneous networks of workstations, 1998, Proceedings. 1998 International Conference on Parallel Processing.

[12]  Robert D. Blumofe, et al.  Scheduling large-scale parallel computations on networks of workstations, 1994, Proceedings of 3rd IEEE International Symposium on High Performance Distributed Computing.

[13]  Mihalis Yannakakis, et al.  Towards an Architecture-Independent Analysis of Parallel Algorithms, 1990, SIAM J. Comput.

[14]  Ramesh Subramonian, et al.  LogP: a practical model of parallel computation, 1996, CACM.

[15]  Franck Cappello, et al.  HiHCoHP-Toward a realistic communication model for hierarchical hyperclusters of heterogeneous processors, 2001, Proceedings 15th International Parallel and Distributed Processing Symposium. IPDPS 2001.

[16]  Robert D. Blumofe, et al.  Scheduling multithreaded computations by work stealing, 1994, Proceedings 35th Annual Symposium on Foundations of Computer Science.

[17]  Miron Livny, et al.  Condor-a hunter of idle workstations, 1988, [1988] Proceedings. The 8th International Conference on Distributed.

[18]  Arnold L. Rosenberg.  Guidelines for data-parallel cycle-stealing in networks of workstations, 1998, Proceedings of the First Merged International Parallel Processing Symposium and Symposium on Parallel and Distributed Processing.