Productivity and performance using partitioned global address space languages

Partitioned Global Address Space (PGAS) languages combine the programming convenience of shared memory with the locality and performance control of message passing. One such language, Unified Parallel C (UPC), is an extension of ISO C defined by a consortium, and it has multiple proprietary and open source compilers. Another PGAS language, Titanium, is a dialect of Java designed for high performance scientific computation. In this paper we describe some of the highlights of two related projects, the Titanium project centered at U.C. Berkeley and the UPC project centered at Lawrence Berkeley National Laboratory. Both compilers use a source-to-source strategy that translates the parallel languages to C with calls to a communication layer called GASNet. The result is a pair of portable, high-performance compilers that run on a large variety of shared and distributed memory multiprocessors. Both projects combine compiler, runtime, and application efforts to demonstrate some of the performance and productivity advantages of these languages.
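To make the PGAS idea concrete, the following is a minimal sketch in plain C that models a partitioned global address space within a single process. It is not UPC or Titanium syntax and does not use the GASNet API; every name in it (`GlobalArray`, `owner_of`, `ga_read`, `sum_all`, `sum_owned`) is hypothetical and chosen only for illustration. Any logical thread can read any element through one global index, as in shared memory, while each element has a well-defined owner, so a program can restrict itself to local data, as in message passing.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative model only: hypothetical names, single process, no real
   communication. A "remote" read is just a read of another partition. */
enum { THREADS = 4, BLOCK = 3, N = THREADS * BLOCK };

typedef struct {
    double data[THREADS][BLOCK]; /* one partition per logical thread */
    size_t local_reads;          /* reads served from the caller's partition */
    size_t remote_reads;         /* reads that would need communication */
} GlobalArray;

/* Blocked layout: global index i lives in the partition of thread i / BLOCK. */
static int owner_of(size_t i) { return (int)(i / BLOCK); }

static void ga_init(GlobalArray *ga) {
    for (int t = 0; t < THREADS; t++)
        for (int k = 0; k < BLOCK; k++)
            ga->data[t][k] = 0.0;
    ga->local_reads = 0;
    ga->remote_reads = 0;
}

static void ga_write(GlobalArray *ga, size_t i, double v) {
    assert(i < N);
    ga->data[owner_of(i)][i % BLOCK] = v;
}

/* Read global element i on behalf of logical thread `me`, recording
   whether the access was local or remote. */
static double ga_read(GlobalArray *ga, int me, size_t i) {
    assert(i < N && me >= 0 && me < THREADS);
    int owner = owner_of(i);
    if (owner == me) ga->local_reads++;
    else             ga->remote_reads++;
    return ga->data[owner][i % BLOCK];
}

/* Shared-memory style: thread `me` touches every element by global index. */
static double sum_all(GlobalArray *ga, int me) {
    double s = 0.0;
    for (size_t i = 0; i < N; i++) s += ga_read(ga, me, i);
    return s;
}

/* Locality-aware style: thread `me` touches only the elements it owns. */
static double sum_owned(GlobalArray *ga, int me) {
    double s = 0.0;
    for (size_t i = (size_t)me * BLOCK; i < (size_t)(me + 1) * BLOCK; i++)
        s += ga_read(ga, me, i);
    return s;
}
```

In this model, `sum_all` is convenient but most of its reads are remote, whereas `sum_owned` performs only local reads. The counters stand in for the communication cost that a real PGAS runtime would incur on a distributed memory machine.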
