RUNTIME SUPPORT AND COMPILATION METHODS FOR USER-SPECIFIED DATA DISTRIBUTIONS

This paper describes two new mechanisms by which an HPF compiler can deal with irregular computations effectively. The first mechanism invokes a user-specified mapping procedure via a set of compiler directives. The directives allow program arrays to be used to describe graph connectivity, the spatial location of array elements, and computational load. The second mechanism is a simple conservative method that in many cases enables a compiler to recognize that previously computed information from inspectors can be reused (e.g., communication schedules, loop iteration partitions, and information that associates off-processor data copies with on-processor buffer locations). We present performance results for these mechanisms from a Fortran 90D compiler implementation.
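
The second mechanism can be illustrated with a minimal Python sketch. It is not the compiler's actual runtime support: the names `InspectorCache`, `mark_modified`, `get_schedule`, and `run_inspector` are hypothetical, and the sketch assumes that reuse is decided by a modification stamp kept for each indirection array. Any statement that might write the array bumps its stamp, so the saved schedule is reused only when no such write has been recorded. Changes to the data distribution itself, which would also invalidate a schedule, are not modeled here.

```python
def run_inspector(index_array, owner_of, my_rank):
    """Toy inspector: sorted, duplicate-free list of the global indices
    referenced through index_array that live on another processor."""
    return sorted({i for i in index_array if owner_of(i) != my_rank})


class InspectorCache:
    """Hypothetical helper: conservative reuse of inspector results."""

    def __init__(self):
        self._clock = 0          # counts recorded (possible) modifications
        self._last_mod = {}      # array name -> clock value at last possible write
        self._saved = {}         # loop id -> (stamp seen by inspector, schedule)
        self.inspector_runs = 0  # how many times the inspector actually ran

    def mark_modified(self, name):
        """Record that the named array may have been written.
        Called conservatively: a false alarm only costs a re-inspection."""
        self._clock += 1
        self._last_mod[name] = self._clock

    def get_schedule(self, loop_id, name, index_array, owner_of, my_rank):
        """Return the schedule for a loop, rerunning the inspector only if
        the indirection array may have changed since the schedule was saved."""
        stamp = self._last_mod.get(name, 0)
        saved = self._saved.get(loop_id)
        if saved is not None and saved[0] == stamp:
            return saved[1]      # safe to reuse the previous result
        schedule = run_inspector(index_array, owner_of, my_rank)
        self.inspector_runs += 1
        self._saved[loop_id] = (stamp, schedule)
        return schedule
```

The check is conservative in the sense used above: it can trigger an unnecessary inspector run (for example, when an array is rewritten with identical values), but it does not reuse a schedule after a recorded write to the indirection array.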
