Compiler techniques for determining data distribution and generating communication sets on distributed-memory machines

This paper presents efficient algorithms for determining data distribution and generating communication sets on distributed-memory multicomputers. First, we propose a dynamic programming algorithm that automatically determines data distribution at compile time. The algorithm can also determine whether data redistribution is necessary between two consecutive DO-loop program fragments. Second, we propose closed forms to represent the communication sets among processing elements for executing doall statements when data arrays are distributed in a restricted block-cyclic fashion. Our methods can be incorporated into current compilers and used when programmers do not provide data distribution directives. Experimental studies on an nCUBE-2 multicomputer are also presented.
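
To make the first contribution more concrete, the following is a minimal sketch of how a dynamic program over consecutive loop fragments might choose a distribution for each fragment and report where redistribution pays off. It is not the paper's algorithm: the function names, the cost tables, and the "block"/"cyclic" labels are all hypothetical, and the costs are assumed to be supplied by some compile-time estimator.

```python
from typing import Dict, List, Tuple


def choose_distributions(
    phase_costs: List[Dict[str, float]],
    redistribution_cost: Dict[Tuple[str, str], float],
) -> Tuple[float, List[str]]:
    """Pick one distribution per loop fragment, minimizing the sum of
    execution costs and redistribution costs (all inputs are assumed)."""
    if not phase_costs:
        return 0.0, []
    # best[d] = (cheapest total cost of a plan ending in distribution d, plan)
    best = {d: (c, [d]) for d, c in phase_costs[0].items()}
    for costs in phase_costs[1:]:
        new_best = {}
        for d, c in costs.items():
            # Keeping the same distribution adds no redistribution cost.
            prev_cost, prev_plan = min(
                (
                    (pc + (0.0 if p == d else redistribution_cost[(p, d)]), plan)
                    for p, (pc, plan) in best.items()
                ),
                key=lambda t: t[0],
            )
            new_best[d] = (prev_cost + c, prev_plan + [d])
        best = new_best
    return min(best.values(), key=lambda t: t[0])


def redistribution_points(plan: List[str]) -> List[int]:
    """Indices of fragments that start with a redistribution."""
    return [i for i in range(1, len(plan)) if plan[i] != plan[i - 1]]


# Hypothetical cost estimates for two consecutive DO-loop fragments.
phases = [{"block": 10, "cyclic": 30}, {"block": 25, "cyclic": 8}]
redist = {("block", "cyclic"): 5, ("cyclic", "block"): 5}
total, plan = choose_distributions(phases, redist)
print(total, plan, redistribution_points(plan))
```

With these assumed costs, the cheapest plan runs the first fragment under "block" and the second under "cyclic", so redistribution is judged worthwhile between them. The table holds one entry per candidate distribution, so the work grows linearly with the number of fragments.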
