Compiling Fortran 90D/HPF for Distributed Memory MIMD Computers

Distributed memory multiprocessors are increasingly used to provide high performance for advanced scientific computations. They offer significant advantages over their shared memory counterparts in cost and scalability, though it is widely accepted that they are difficult to program given the current state of software technology. Currently, distributed memory machines are programmed using a node language and a message passing library. This process is tedious and error-prone because the user must handle data distribution and the communication needed for non-local data access. This thesis describes an advanced compiler that can generate efficient parallel programs when the source programming language naturally represents an application's parallelism. Fortran 90D/HPF, described in this thesis, is such a language. In Fortran 90D/HPF, parallelism is expressed with parallel constructs such as array operations, where statements, forall statements, and intrinsic functions, and the language provides directives for data distribution. Fortran 90D/HPF thus gives the programmer powerful tools to express problems with natural data parallelism. To validate this hypothesis, a prototype Fortran 90D/HPF compiler was implemented. The compiler is organized around several major units: language parsing, partitioning of data and computation, communication detection, and code generation. The compiler recognizes communication patterns in the computations in order to generate appropriate communication calls. Specifically, this involves a number of tests on the relationships among the subscripts of the arrays in a statement. The compiler includes a specially designed algorithm to detect communication and to generate appropriate collective communication calls for executing array assignments and forall statements.
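The idea of testing subscript relationships to choose a communication call can be illustrated with a small sketch. This is not the thesis's algorithm: the `Subscript` type, the `classify_communication` function, and the pattern labels are hypothetical, and the sketch assumes a single distributed dimension in which both arrays are aligned and distributed identically.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Subscript:
    """A subscript of the form index_var + offset (hypothetical model).

    index_var is None when the subscript is a constant, e.g. B(5).
    """
    index_var: Optional[str]
    offset: int = 0


def classify_communication(lhs: Subscript, rhs: Subscript) -> str:
    """Label the communication implied by an assignment lhs = rhs.

    Assumption: one distributed dimension, with both arrays aligned
    and distributed identically. The labels are illustrative only.
    """
    if rhs.index_var is None:
        if lhs.index_var is None:
            # A(c1) = B(c2): same element position means same owner.
            return "none" if lhs.offset == rhs.offset else "transfer"
        # A(i) = B(c): every owner of A needs the same element of B.
        return "broadcast"
    if lhs.index_var == rhs.index_var:
        # A(i + a) = B(i + b): a constant distance between elements.
        return "none" if lhs.offset == rhs.offset else "shift"
    # Different index variables (or a constant lhs with a variable rhs):
    # no regular pattern is recognized by this sketch.
    return "general"


if __name__ == "__main__":
    i = Subscript("i")
    print(classify_communication(i, Subscript("i", 1)))   # A(i) = B(i+1)
    print(classify_communication(i, Subscript(None, 5)))  # A(i) = B(5)
```

A real compiler would apply tests of this kind per dimension and would also take the distribution directives into account. The point here is only that a structured pattern, once recognized, can be mapped to a single collective call instead of many point-to-point messages.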
The Fortran 90D/HPF compiler performs several types of communication and computation optimizations to improve the performance of the generated code. Empirical measurements show that the performance of the code generated by the Fortran 90D/HPF compiler is comparable to that of corresponding hand-written codes on several systems. We hope that this thesis assists in the widespread adoption of parallel computing technology and leads to a more attractive and powerful software development environment supporting the application parallelism that many users need.
