Program Repartitioning on Varying Communication Cost Parallel Architectures

In an earlier work, aThreshold Scheduling Algorithmwas proposed to schedule the functional (DAG) parallelism in a program on distributed memory systems. In this work, we address the issue of adapting the schedule for a set of distributed memory architectures with the same computation costs but higher communication costs. We introduce a new concept ofdominant edgesof a schedule to denote those edges which dictate the schedule time of the destination nodes due to the changes in their communication costs. Using this concept, we derive the conditions under which schedule on the whole or at least part of the graph can be reused for a different architecture keeping the cost of program repartitioning and rescheduling to a minimum. We demonstrate the practical significance of the method by incorporating it in the compiler backend for targeting Sisal (Streams and Iterations in a Single Assignment Language) on a family of Intel i860 architectures, Gamma, Delta, and Paragon, which vary in their communication costs. It is shown that almost 30 to 65% of the schedule can be reused unchanged, thereby avoiding program repartitioning to a large degree. The remainder of the schedule can beregeneratedthrough a linear algorithm at run time.

[1]  David C. Cann,et al.  A Report on the Sisal Language Project , 1990, J. Parallel Distributed Comput..

[2]  Thomas L. Casavant,et al.  A Taxonomy of Scheduling in General-Purpose Distributed Computing Systems , 1988, IEEE Trans. Software Eng..

[3]  Vivek Sarkar,et al.  Partitioning and scheduling parallel programs for execution on multiprocessors , 1987 .

[4]  Hesham H. Ali,et al.  An Optimal Algorithm for Scheduling Interval Ordered Tasks with Communication on N Processors , 1995, J. Comput. Syst. Sci..

[5]  Behrooz Shirazi,et al.  Analysis and Evaluation of Heuristic Methods for Static Task Scheduling , 1990, J. Parallel Distributed Comput..

[6]  Mihalis Yannakakis,et al.  Towards an Architecture-Independent Analysis of Parallel Algorithms , 1990, SIAM J. Comput..

[7]  Mayez A. Al-Mouhamed,et al.  Lower Bound on the Number of Processors and Time for Scheduling Precedence Graphs with Communication Costs , 1990, IEEE Trans. Software Eng..

[8]  Shahid H. Bokhari,et al.  On the Mapping Problem , 1981, IEEE Transactions on Computers.

[9]  Dharma P. Agrawal,et al.  A Task Duplication Based Optimal Scheduling Algorithm for Variable Execution Time Tasks , 1994, 1994 Internatonal Conference on Parallel Processing Vol. 2.

[10]  Tao Yang,et al.  On the Granularity and Clustering of Directed Acyclic Task Graphs , 1993, IEEE Trans. Parallel Distributed Syst..

[11]  E.L. Lawler,et al.  Optimization and Approximation in Deterministic Sequencing and Scheduling: a Survey , 1977 .

[12]  Dharma P. Agrawal,et al.  A Scalable Scheduling Scheme for Functional Parallelism on Distributed Memory Multiprocessor Systems , 1995, IEEE Trans. Parallel Distributed Syst..

[13]  Jake K. Aggarwal,et al.  Generalized Mapping of Parallel Algorithms Onto Parallel Architectures , 1990, ICPP.

[14]  Frank D. Anger,et al.  Scheduling with Sufficient Loosely Coupled Processors , 1990, J. Parallel Distributed Comput..