Extensible PGAS semantics for C++

The Partitioned Global Address Space (PGAS) model combines the explicit expression of data locality in SPMD applications, which is crucial to achieving good parallel performance, with the relative simplicity of the Distributed Shared Memory model. C++ currently lacks language support for PGAS semantics; however, C++ is an excellent host language for implementing Domain-Specific Embedded Languages (DSELs). Leveraging this capability, we have implemented the Partitioned Global Property Map, a DSEL library that supports PGAS semantics, polymorphic partitioned global data structures, and a number of useful extensions. The library uses template meta-programming to map high-level semantics directly to efficient underlying implementations at compile time. It combines flexible, extensible semantics, high performance, and portability across different low-level communication interfaces, allowing PGAS programs to be expressed in C++.
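As a rough illustration of the compile-time mapping idea described above, the following minimal sketch selects a data distribution through a template parameter, so the index-to-owner computation is resolved statically. It is not the Partitioned Global Property Map API: the names `ToyGlobalMap`, `BlockDistribution`, `CyclicDistribution`, `locate`, and `owner_of` are hypothetical, and the "ranks" are simulated as vectors inside a single process rather than reached through a communication layer.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical distribution policies (not from the library): each maps a
// global index to a pair (owning rank, local offset on that rank).
struct BlockDistribution {
    std::size_t block;  // number of consecutive elements stored per rank
    std::pair<std::size_t, std::size_t> locate(std::size_t g) const {
        return {g / block, g % block};
    }
};

struct CyclicDistribution {
    std::size_t ranks;  // elements are dealt out round-robin over the ranks
    std::pair<std::size_t, std::size_t> locate(std::size_t g) const {
        return {g % ranks, g / ranks};
    }
};

// Toy stand-in for a partitioned global map. The distribution is a template
// parameter, so the choice of policy is fixed at compile time and the call
// to locate() can be inlined. Ranks are simulated as separate vectors in one
// process; a real implementation would communicate for remote accesses.
template <typename T, typename Distribution>
class ToyGlobalMap {
public:
    ToyGlobalMap(std::size_t ranks, std::size_t per_rank, Distribution d)
        : dist_(d), storage_(ranks, std::vector<T>(per_rank)) {}

    // Store a value under a global index.
    void put(std::size_t g, const T& value) {
        const std::pair<std::size_t, std::size_t> loc = dist_.locate(g);
        storage_[loc.first][loc.second] = value;
    }

    // Read the value stored under a global index.
    T get(std::size_t g) const {
        const std::pair<std::size_t, std::size_t> loc = dist_.locate(g);
        return storage_[loc.first][loc.second];
    }

    // Expose locality: which simulated rank owns this global index.
    std::size_t owner_of(std::size_t g) const { return dist_.locate(g).first; }

private:
    Distribution dist_;
    std::vector<std::vector<T>> storage_;  // one vector per simulated rank
};
```

With this sketch, `ToyGlobalMap<int, BlockDistribution> m(2, 4, BlockDistribution{4});` and `ToyGlobalMap<int, CyclicDistribution> m(2, 4, CyclicDistribution{2});` present the same `put`/`get` interface while placing elements differently, which is the kind of separation between high-level semantics and underlying implementation that the abstract describes.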
