Abstract: In this paper, we propose a heuristic for code partitioning for distributed memory multiprocessors (DMMs). Our method is based on data flow, so all levels of parallelism can potentially be exploited. Given a weighted directed acyclic graph (DAG) representation of the program, our partitioning algorithm automatically determines the granularity of parallelism by partitioning the graph into tasks to be scheduled on the DMM. The granularity of parallelism depends only on the program to be executed and on the target machine parameters. The output of our algorithm is passed as input to the scheduling phase. Unlike the scheduling problem as defined by Yang [A. Gerasoulis, T. Yang, IEEE Transactions on Parallel and Distributed Systems 4 (6) (1993) 686–701; T. Yang, Ph.D. Thesis, Rutgers University, New Brunswick, NJ, May 1993; T. Yang, A. Gerasoulis, IEEE Transactions on Parallel and Distributed Systems 5 (9) (1994) 951–967], the method presented in this paper uses task merging rather than task clustering. Finding an optimal solution to this problem is NP-complete. Because graph algorithms are expensive, it is nearly impossible to obtain close-to-optimal solutions without incurring a very high cost (a high-order polynomial). Our goal is therefore a heuristic that gives good performance at relatively low cost. Given a DAG with E edges and N nodes, the time complexity of our partitioning algorithm is O(E · N^3) in the worst case. For some cases, the average time complexity of the algorithm is O(N(E + N)).
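To make the idea of merging tasks in a weighted DAG concrete, the following is a minimal sketch and not the heuristic proposed in the paper. It assumes that node weights are computation costs, that edge weights are communication costs, and that parallel edges created by a merge have their costs summed. The names `TaskGraph`, `has_other_path`, `greedy_merge`, and the `threshold` parameter are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class TaskGraph:
    """Weighted DAG: node weights are compute costs, edge weights are communication costs."""

    def __init__(self):
        self.cost = {}                 # node -> compute cost
        self.succ = defaultdict(dict)  # u -> {v: communication cost}
        self.pred = defaultdict(dict)  # v -> {u: communication cost}

    def add_node(self, n, cost):
        self.cost[n] = cost

    def add_edge(self, u, v, comm):
        self.succ[u][v] = comm
        self.pred[v][u] = comm

    def has_other_path(self, u, v):
        """True if v is reachable from u without using the direct edge u->v.

        Merging u and v in that case would create a cycle.
        """
        stack = [w for w in self.succ[u] if w != v]
        seen = set(stack)
        while stack:
            w = stack.pop()
            if w == v:
                return True
            for x in self.succ[w]:
                if x not in seen:
                    seen.add(x)
                    stack.append(x)
        return False

    def merge(self, u, v):
        """Merge v into u: the edge u->v disappears and compute costs add up."""
        self.cost[u] += self.cost.pop(v)
        del self.succ[u][v]
        del self.pred[v][u]
        # Redirect v's outgoing edges to start at u (assumption: sum parallel edges).
        for w, c in self.succ.pop(v, {}).items():
            del self.pred[w][v]
            self.succ[u][w] = self.succ[u].get(w, 0) + c
            self.pred[w][u] = self.succ[u][w]
        # Redirect v's remaining incoming edges to end at u.
        for w, c in self.pred.pop(v, {}).items():
            del self.succ[w][v]
            self.succ[w][u] = self.succ[w].get(u, 0) + c
            self.pred[u][w] = self.succ[w][u]


def greedy_merge(g, threshold):
    """Illustrative greedy rule (an assumption, not the paper's heuristic):
    repeatedly merge the endpoints of the costliest edge whose communication
    cost exceeds `threshold`, skipping merges that would create a cycle."""
    while True:
        candidates = [
            (c, u, v)
            for u in list(g.succ)
            for v, c in g.succ[u].items()
            if c > threshold and not g.has_other_path(u, v)
        ]
        if not candidates:
            return g
        _, u, v = max(candidates, key=lambda t: t[0])
        g.merge(u, v)


# Usage: a diamond-shaped graph with one expensive edge a->b.
g = TaskGraph()
for name, cost in [("a", 2), ("b", 3), ("c", 4), ("d", 1)]:
    g.add_node(name, cost)
g.add_edge("a", "b", 10)
g.add_edge("a", "c", 1)
g.add_edge("b", "d", 1)
g.add_edge("c", "d", 1)
greedy_merge(g, threshold=5)
print(sorted(g.cost.items()))
```

In this sketch the granularity is controlled by a single threshold; in the setting the abstract describes, it would instead be derived from the program and the target machine parameters.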
[1] Ken Kennedy et al. Compiling Fortran D for MIMD distributed-memory machines. CACM, 1992.
[2] M. Haines et al. Towards a distributed memory implementation of Sisal. Proceedings Scalable High Performance Computing Conference SHPCC-92, 1992.
[3] Matthew Dennis Haines et al. Distributed runtime support for task and data management. 1993.
[4] Vivek Sarkar et al. An automatically partitioning compiler for SISAL. 1989.
[5] Tao Yang et al. On the Granularity and Clustering of Directed Acyclic Task Graphs. IEEE Trans. Parallel Distributed Syst., 1993.
[6] Vivek Sarkar et al. Compile-time partitioning and scheduling of parallel programs. SIGPLAN '86, 1986.
[7] Tao Yang et al. DSC: Scheduling Parallel Tasks on an Unbounded Number of Processors. IEEE Trans. Parallel Distributed Syst., 1994.
[8] Tao Yang et al. Scheduling and code generation for parallel architectures. 1993.
[9] Jean-Luc Gaudiot et al. Automatic code partitioning for distributed-memory multiprocessors (DMMs). 1996.
[10] Vivek Sarkar et al. Partitioning parallel programs for macro-dataflow. LFP '86, 1986.
[11] Vivek Sarkar et al. Partitioning and scheduling parallel programs for execution on multiprocessors. 1987.