Distributed Memory Compiler Design for Sparse Problems

This paper addresses the problem of compiling concurrent loop nests in the presence of complicated array references and irregularly distributed arrays. Arrays referenced within loops may be accessed in ways that make it impossible to determine the reference pattern precisely at compile time. The paper proposes a runtime support mechanism that a compiler can use effectively to generate efficient code in these situations. The compiler accepts as input a Fortran 77 program enhanced with specifications for distributing data, and outputs a message-passing program that runs on the nodes of a distributed-memory machine. The runtime support for the compiler consists of a library of primitives designed to support irregular patterns of distributed array accesses and irregularly distributed array partitions. A variety of performance results on the Intel iPSC/860 are presented.
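To make the compile-time problem concrete, the following is a minimal sketch of how runtime support might handle a loop such as `y(i) = x(ia(i))`, where the indirection array `ia` is known only when the program runs. The abstract does not describe the library's primitives, so everything here is an assumption made for illustration: the inspector/executor split, the block distribution, and the names `owner`, `inspector`, `gather`, and `executor` are all hypothetical, and the "processors" are simulated as Python lists rather than real message-passing nodes.

```python
def owner(g, block):
    """Processor owning global index g under an assumed block distribution."""
    return g // block


def inspector(indices, me, block):
    """Scan the indirection array once, before the loop runs.

    Returns:
      schedule: dict mapping owner -> global indices to fetch from it
      refs:     one ("local", offset) or ("ghost", slot) entry per reference
      slot_of:  dict mapping each off-processor global index -> ghost slot
    """
    schedule, slot_of, refs = {}, {}, []
    for g in indices:
        p = owner(g, block)
        if p == me:
            refs.append(("local", g - me * block))
        else:
            if g not in slot_of:  # request each remote element only once
                slot_of[g] = len(slot_of)
                schedule.setdefault(p, []).append(g)
            refs.append(("ghost", slot_of[g]))
    return schedule, refs, slot_of


def gather(schedule, slot_of, parts, block):
    """Stand-in for the communication phase: copy remote values into a ghost buffer."""
    ghost = [None] * len(slot_of)
    for p, wanted in schedule.items():
        for g in wanted:
            ghost[slot_of[g]] = parts[p][g - p * block]
    return ghost


def executor(refs, local, ghost):
    """Run the loop body using only local and ghost storage."""
    return [local[k] if kind == "local" else ghost[k] for kind, k in refs]


def run_example():
    block = 4
    x = [10 * i for i in range(8)]   # global array, 8 elements
    parts = [x[0:4], x[4:8]]         # two simulated processors
    ia = [0, 5, 5, 3, 7]             # indirection array, known only at run time
    schedule, refs, slot_of = inspector(ia, 0, block)
    ghost = gather(schedule, slot_of, parts, block)
    return schedule, executor(refs, parts[0], ghost)
```

In this sketch, `run_example()` plays the role of processor 0: the inspector finds that global elements 5 and 7 live on processor 1 and requests each once, even though element 5 is referenced twice. A schedule of this kind could be reused across loop iterations as long as `ia` does not change, which is one plausible reason to separate inspection from execution.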
