Reducing Communication Cost for Parallelizing Irregular Scientific Codes

In most distributed-memory computations, node programs are executed on processors according to the owner-computes rule. However, the owner-computes rule is not well suited to irregular application codes. In such codes, indirection in accessing the left-hand-side array makes it difficult to partition the loop iterations, and because right-hand-side elements are also accessed through indirection, total communication may be reduced by using heuristics other than the owner-computes rule. In this paper, we propose a communication-cost-reducing computes rule for irregular loop partitioning, called the least-communication computes rule. Each loop iteration is assigned to the processor on which executing that iteration incurs the minimal communication cost. The experimental results show that, in most cases, our approach achieved better performance than other loop partitioning rules.
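The contrast between the two rules can be illustrated with a minimal sketch. It is not the paper's implementation: the block distribution, the cost model (one unit per referenced element that the executing processor does not own, counting left- and right-hand-side references alike), the tie-break toward the lowest processor id, and all function names are assumptions made for illustration.

```python
from collections import Counter


def block_owner(index, n, num_procs):
    """Owner of element `index` under an assumed block distribution of n elements."""
    block = -(-n // num_procs)  # ceiling division gives the block size
    return index // block


def owner_computes_partition(lhs_per_iter, n, num_procs):
    """Owner-computes rule: an iteration runs on the owner of its left-hand-side element."""
    return [block_owner(lhs, n, num_procs) for lhs in lhs_per_iter]


def least_communication_partition(refs_per_iter, n, num_procs):
    """Assign each iteration to the processor with the fewest remote references.

    Assumed cost model: every referenced element not owned by the executing
    processor costs one unit. Ties go to the lowest processor id.
    """
    assignment = []
    for refs in refs_per_iter:
        owned = Counter(block_owner(r, n, num_procs) for r in refs)
        best = min(range(num_procs), key=lambda p: (len(refs) - owned[p], p))
        assignment.append(best)
    return assignment


# Hypothetical irregular loop: x[ia[i]] = y[ib[i]] + y[ic[i]],
# with x and y assumed to share the same block distribution.
ia = [0, 5, 2]
ib = [4, 6, 1]
ic = [7, 1, 3]
n, num_procs = 8, 2

by_owner = owner_computes_partition(ia, n, num_procs)
by_least_comm = least_communication_partition(list(zip(ia, ib, ic)), n, num_procs)
print(by_owner)       # [0, 1, 0]
print(by_least_comm)  # [1, 1, 0]
```

In this example the first iteration references elements 0, 4, and 7. Under the assumed cost model, running it on processor 0 (the owner of the left-hand side) requires two remote accesses, while running it on processor 1 requires only one, so the two rules place it differently.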
