Global Arrays: a portable "shared-memory" programming model for distributed memory computers

Portability, efficiency, and ease of coding are all important considerations in choosing the programming model for a scalable parallel application. The message-passing programming model is widely used because of its portability, yet some applications are too complex to code in it while also maintaining a balanced computational load and avoiding redundant computation. The shared-memory programming model simplifies coding, but it is not portable and often provides little control over interprocessor data transfer costs. This paper describes a new approach, called Global Arrays (GA), that combines the better features of both models, leading to both simple coding and efficient execution. The key concept of GA is that it provides a portable interface through which each process in a MIMD parallel program can asynchronously access logical blocks of physically distributed matrices, with no need for explicit cooperation by other processes. We have implemented GA libraries on a variety of computer systems, including the Intel DELTA and Paragon and the IBM SP-1 (all message-passing machines), the Kendall Square KSR-2 (a nonuniform-access shared-memory machine), and networks of Unix workstations. We discuss the design and implementation of these libraries, report their performance, illustrate the use of GA in the context of computational chemistry applications, and describe the use of a GA performance visualization tool.
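
The key concept described above, a process reading or writing a logical block of a matrix without knowing which processes physically hold it, can be illustrated with a small single-process sketch. This is a toy model and not the GA library interface: the class `ToyGlobalArray`, its methods, and the row-block distribution are all hypothetical names and choices made for illustration, and plain Python lists stand in for the memories of separate processes.

```python
class ToyGlobalArray:
    """Toy model (hypothetical, not the GA API): a matrix whose rows are
    split into contiguous blocks, one block per simulated process."""

    def __init__(self, rows, cols, nprocs):
        self.rows, self.cols, self.nprocs = rows, cols, nprocs
        # Rows per owner, rounded up so that every row has an owner.
        self.block = -(-rows // nprocs)
        # local[p] stores only the rows "owned" by simulated process p.
        self.local = [
            [[0.0] * cols for _ in range(self._first(p), self._last(p))]
            for p in range(nprocs)
        ]

    def _first(self, p):
        # Global index of the first row owned by p.
        return min(p * self.block, self.rows)

    def _last(self, p):
        # One past the global index of the last row owned by p.
        return min((p + 1) * self.block, self.rows)

    def owner(self, i):
        # Simulated process that physically holds global row i.
        return i // self.block

    def get(self, r0, r1, c0, c1):
        """Copy the logical block [r0:r1, c0:c1] into a local buffer,
        fetching each row from whichever owner holds it."""
        out = []
        for i in range(r0, r1):
            p = self.owner(i)
            out.append(list(self.local[p][i - self._first(p)][c0:c1]))
        return out

    def put(self, r0, c0, patch):
        """Write a local patch into the array starting at (r0, c0).
        The caller must keep the patch inside the array bounds."""
        for di, row in enumerate(patch):
            i = r0 + di
            p = self.owner(i)
            self.local[p][i - self._first(p)][c0:c0 + len(row)] = row


if __name__ == "__main__":
    ga = ToyGlobalArray(rows=6, cols=4, nprocs=3)
    # Rows 1..3 span two owners, but the caller addresses them logically.
    ga.put(1, 1, [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
    print(ga.get(1, 4, 1, 3))
```

The caller works only with global row and column indices; the mapping from indices to owners stays inside the class. A real implementation would replace the list lookups with one-sided data transfers between processes, which this sketch does not attempt to model.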
