A Compilation Technique for Varying Communication Cost NUMA Architectures

In earlier work, a Threshold Scheduling Algorithm was proposed to schedule the functional parallelism in a program on distributed memory systems. In this work, we address the regeneration of the schedule for a set of distributed memory architectures with different communication costs. We introduce the concept of dominant edges of a schedule: those edges whose changes in communication cost dictate schedule regeneration. We show that, under certain conditions, the schedule for the whole graph, or at least part of it, can be reused for a different architecture, reducing the cost of program re-partitioning and re-scheduling. The usefulness of this method is demonstrated by incorporating it into the scheduler of a compiler backend that targets Sisal (Streams and Iterations in a Single Assignment Language) on a family of Intel i860 architectures, Gamma, Delta and Paragon, which vary in their communication costs. The results show that about 30 to 65% of the schedule can be reused, thereby avoiding program re-partitioning to a large degree.
