Codes

Rajesh Bordawekar, Alok Choudhary
ECE Dept., 121, Link Hall, Syracuse University, Syracuse, NY 13244
{rajesh, choudhar}@cat.syr.edu

J. Ramanujam
ECE Dept., Louisiana State University, Baton Rouge, LA 70803
jxr@gate.ee.lsu.edu

Abstract

In this paper, we describe a technique for optimizing communication in out-of-core distributed memory stencil problems. In these problems, communication may require both inter-processor communication and file I/O. We show that in certain cases, the extra file I/O incurred in communication can be completely eliminated by reordering in-core computations. The in-core computation pattern is decided by (1) how the out-of-core data is distributed into in-core slabs (tiling) and (2) how the slabs are accessed. We show that a compiler using the stencil and processor information can choose the tiling parameters and schedule the tile accesses so that the extra file I/O is eliminated and overall performance is improved.
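To make the tiling argument concrete, the following is a minimal toy model and not the paper's algorithm. It assumes a 5-point stencil, processors arranged in a one-dimensional row that each own an n x n block of columns, and all processors stepping through their slabs in lock-step in the same order. The function name `extra_file_reads` and the parameters `n`, `n_slabs`, and `tiling` are hypothetical names chosen for this illustration. The model counts how often a ghost column needed from a neighbor is not in the neighbor's in-core slab at that step, so that it would have to be fetched from the neighbor's file.

```python
def slab_extent(n, n_slabs, k, tiling):
    """Return (rows, cols) of the local array held in core at step k.

    Assumption: n is divisible by n_slabs, and slabs are contiguous
    strips of rows ("row") or of columns ("column").
    """
    size = n // n_slabs
    strip = range(k * size, (k + 1) * size)
    full = range(n)
    return (strip, full) if tiling == "row" else (full, strip)


def extra_file_reads(n, n_slabs, tiling):
    """Count ghost-column requests that miss the neighbor's in-core slab.

    Toy model for an interior processor: the left ghost is the left
    neighbor's last column (index n - 1), the right ghost is the right
    neighbor's first column (index 0). Neighbors hold a slab of the
    same extent at the same step (lock-step assumption).
    """
    misses = 0
    for k in range(n_slabs):
        rows, cols = slab_extent(n, n_slabs, k, tiling)
        nbr_rows, nbr_cols = slab_extent(n, n_slabs, k, tiling)
        rows_in_core = set(rows) <= set(nbr_rows)
        # Left ghost is needed only when the slab touches local column 0.
        if 0 in cols and not (rows_in_core and (n - 1) in nbr_cols):
            misses += 1
        # Right ghost is needed only when the slab touches column n - 1.
        if (n - 1) in cols and not (rows_in_core and 0 in nbr_cols):
            misses += 1
    return misses


if __name__ == "__main__":
    for tiling in ("row", "column"):
        print(tiling, extra_file_reads(8, 4, tiling))
```

Under these assumptions, row slabs always contain both boundary columns for the rows in core, so every ghost request is served from the neighbor's memory, while column slabs leave the required boundary column out of core at the first and last steps. A different slab access order or processor arrangement would change the counts, which is the degree of freedom the abstract attributes to the compiler.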