Quo Vadis MPI RMA? Towards a More Efficient Use of MPI One-Sided Communication

The MPI standard has long included one-sided communication abstractions through the MPI Remote Memory Access (RMA) interface. Unfortunately, the MPI RMA chapter in version 4.0 of the MPI standard still contains both well-known and lesser-known shortcomings for implementations and users alike, which can lead to sub-optimal usage patterns. In this paper, we identify a set of issues and propose ways for applications to better express their anticipated usage of RMA routines, allowing the MPI implementation to better adapt to the application’s needs. To increase the flexibility of the RMA interface, we add the capability to duplicate windows, allowing the resources encapsulated by a window to be accessed under different configurations. In the same vein, we introduce the concept of MPI memory handles, which provide lifetime guarantees on memory attached to dynamic windows and remove the overhead currently incurred when using dynamically exposed memory. We show that our extensions improve accumulate latencies, reduce the overhead of multi-threaded flushes, and enable zero-overhead use of dynamic memory windows.
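As background for the dynamic-window discussion, the sketch below shows the standard MPI-3 pattern for dynamically attached memory, using only existing MPI calls; it is not taken from the paper, and the buffer size and ring-style access pattern are illustrative assumptions. The explicit attach and address-exchange steps are the kind of per-attachment overhead the proposed memory handles are intended to remove.

/* Minimal sketch (not from the paper): the standard MPI-3 dynamic-window
 * pattern whose attach and address-exchange overhead the proposed memory
 * handles aim to remove. Only existing MPI calls are used; buffer size and
 * the ring-style access pattern are illustrative assumptions. */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* A dynamic window exposes no memory at creation time. */
    MPI_Win win;
    MPI_Win_create_dynamic(MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    /* Each process attaches a locally allocated buffer ... */
    const MPI_Aint count = 1024;
    double *buf = calloc(count, sizeof(double));
    MPI_Win_attach(win, buf, count * sizeof(double));

    /* ... and must publish its absolute address before any peer can
     * target it; this exchange is part of the per-attachment overhead
     * discussed in the paper. */
    MPI_Aint local_addr;
    MPI_Get_address(buf, &local_addr);
    MPI_Aint *addrs = malloc(size * sizeof(MPI_Aint));
    MPI_Allgather(&local_addr, 1, MPI_AINT, addrs, 1, MPI_AINT, MPI_COMM_WORLD);

    /* One-sided access to the neighbor's dynamically attached memory;
     * for dynamic windows the target displacement is the absolute address. */
    int target = (rank + 1) % size;
    double value = (double)rank;
    MPI_Win_lock(MPI_LOCK_SHARED, target, 0, win);
    MPI_Put(&value, 1, MPI_DOUBLE, target, addrs[target], 1, MPI_DOUBLE, win);
    MPI_Win_flush(target, win);
    MPI_Win_unlock(target, win);

    /* Ensure all remote accesses have completed before detaching. */
    MPI_Barrier(MPI_COMM_WORLD);
    MPI_Win_detach(win, buf);
    MPI_Win_free(&win);
    free(buf);
    free(addrs);
    MPI_Finalize();
    return 0;
}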
