A Newcomer In The PGAS World - UPC++ vs UPC: A Comparative Study

A newcomer in the Partitioned Global Address Space (PGAS) "world" has arrived in its version 1.0: Unified Parallel C++ (UPC++). UPC++ targets distributed data structures where communication is irregular or fine-grained. The key abstractions are global pointers, asynchronous programming via remote procedure calls (RPC), futures, and promises. The UPC++ APIs for moving non-contiguous data and for handling memories with different optimal access methods resemble those used in modern C++. In this study we provide two kernels implemented in UPC++: a sparse matrix-vector multiplication (SpMV) as part of a partial differential equation solver, and an implementation of the heat equation on a 2D domain. Code listings of these two kernels are included in the article to show the differences in programming style between UPC and UPC++. We provide a performance comparison between UPC and UPC++ on single-node, multi-node, and many-core hardware (Intel Xeon Phi Knights Landing).
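To make the SpMV kernel concrete, the following is a minimal serial sketch in standard C++, assuming the compressed sparse row (CSR) storage format. It is not the UPC or UPC++ code from the article, and the names `CsrMatrix` and `spmv` are hypothetical. It only illustrates why the access pattern is irregular: the reads of `x` are indexed through `col_idx`, so in a distributed setting some of those entries may live in another process's memory.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CSR container (illustrative names, not taken from the article).
struct CsrMatrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<std::size_t> row_ptr;  // size rows + 1; row i spans [row_ptr[i], row_ptr[i+1])
    std::vector<std::size_t> col_idx;  // column index of each stored nonzero
    std::vector<double> values;        // value of each stored nonzero
};

// Serial y = A * x for a CSR matrix.
std::vector<double> spmv(const CsrMatrix& a, const std::vector<double>& x) {
    assert(x.size() == a.cols);
    assert(a.row_ptr.size() == a.rows + 1);

    std::vector<double> y(a.rows, 0.0);
    for (std::size_t i = 0; i < a.rows; ++i) {
        double sum = 0.0;
        for (std::size_t k = a.row_ptr[i]; k < a.row_ptr[i + 1]; ++k) {
            // Indirect, data-dependent read of x: this is the irregular,
            // fine-grained access a PGAS version has to serve.
            sum += a.values[k] * x[a.col_idx[k]];
        }
        y[i] = sum;
    }
    return y;
}
```

In a PGAS version, the rows of the matrix and the entries of `x` would typically be partitioned across processes, and the indirect read would become either a remote access or a lookup into a locally gathered copy. How the article's kernels handle this is not shown here.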
