Extending High Performance Fortran for the support of unstructured computations

We have extended an existing HPF compiler with language features designed for the parallelization of unstructured computations on multicomputers. The language extensions include block-general distributions and dynamic data distributions specified through user-defined mapping arrays and functions. A prototype compiler has been implemented which features different run-time preprocessing mechanisms and also allows clean integration of explicit message-passing primitives. The compiler is developed as part of a complete multicomputer programming environment being used by a group of application developers in the framework of the Joint CSCS-ETH/NEC Collaboration in Parallel Processing. As such, it is supported by a high-level debugger and performance monitor, and the usability and efficiency of generated parallel programs are validated by the application developers. In this paper, we summarize the programming paradigm implemented through HPF extensions, and detail the respective compiler directives. We describe the implemented run-time preprocessing mechanisms and evaluate the efficiency of compiler-generated code on an NEC Cenju-3 multicomputer.
