In machines like the Intel iPSC/2 and the BBN Butterfly, local memory operations are much faster than inter-processor communication. When writing programs for these machines, programmers must take care to exploit spatial locality of reference. This is tedious and lowers the level of abstraction at which the programmer works. We are implementing a parallelizing compiler that will shoulder much of that burden. Given a sequential, shared-memory program and a specification of how data structures are to be mapped across the processors, our compiler will perform process decomposition to exploit locality of reference. In this paper, we discuss some experiments in parallelizing SIMPLE, a large scientific benchmark from Los Alamos, for the Intel iPSC/2.
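As a rough illustration of mapping-driven process decomposition, the sketch below splits a small sequential loop across processors. It assumes a block mapping and an "owner computes" rule (each processor runs the iterations that write to elements it owns); the abstract states neither choice, and the names `block_owner` and `decompose` are hypothetical, not part of the compiler described here.

```python
def block_owner(i, n, p):
    """Processor owning element i when n elements are block-mapped onto p processors."""
    block = -(-n // p)  # ceiling division: elements per processor
    return i // block


def decompose(n, p):
    """Decompose the sequential loop
           for i in 1 .. n-2:  b[i] = a[i-1] + a[i+1]
       into per-processor iteration lists, and count the reads each
       processor must satisfy from another processor's memory."""
    work = {q: [] for q in range(p)}
    remote_reads = {q: 0 for q in range(p)}
    for i in range(1, n - 1):
        q = block_owner(i, n, p)      # assumed rule: owner of b[i] runs iteration i
        work[q].append(i)
        for j in (i - 1, i + 1):      # operands a[i-1] and a[i+1]
            if block_owner(j, n, p) != q:
                remote_reads[q] += 1  # operand lives on another processor
    return work, remote_reads


work, remote_reads = decompose(n=8, p=2)
print(work)
print(remote_reads)
```

In this toy case only the iterations at the block boundary read a remote element, which is the kind of locality a good mapping is meant to expose.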