Enabling MPI interoperability through flexible communication endpoints

The current MPI model defines a one-to-one relationship between MPI processes and MPI ranks. This model captures many use cases effectively, such as one MPI process per core and one MPI process per node. However, this semantic limits interoperability between MPI and other programming models that use threads within a node. In this paper, we describe an extension to MPI that introduces communication endpoints as a means to relax the one-to-one relationship between ranks and processes. Endpoints enable a greater degree of interoperability between MPI and other programming models, and we illustrate their potential for additional performance and computation management benefits through the decoupling of ranks from processes.
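
As a rough illustration of the idea, the sketch below shows how a hybrid MPI+OpenMP program might use the proposed interface. The MPI_Comm_create_endpoints routine and its exact signature follow the paper's proposal and are not part of the MPI standard, so the names and arguments here should be read as assumptions rather than a definitive API. The intent is that each thread drives its own endpoint and therefore holds its own rank in the resulting communicator.

    /* Hypothetical sketch only: MPI_Comm_create_endpoints is the extension
     * proposed in this paper, not a standard MPI routine, and its exact
     * signature is assumed here. */
    #include <mpi.h>
    #include <omp.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int provided;
        MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);

        int num_threads = omp_get_max_threads();
        MPI_Comm ep_comm[num_threads];   /* one endpoint handle per thread */

        /* Proposed call: each process asks for num_threads endpoints; every
         * endpoint receives a distinct rank in the new communicator. */
        MPI_Comm_create_endpoints(MPI_COMM_WORLD, num_threads,
                                  MPI_INFO_NULL, ep_comm);

        #pragma omp parallel
        {
            int tid = omp_get_thread_num();
            int ep_rank, ep_size;

            /* A thread using an endpoint communicator behaves like a full
             * MPI process with its own rank. */
            MPI_Comm_rank(ep_comm[tid], &ep_rank);
            MPI_Comm_size(ep_comm[tid], &ep_size);
            printf("thread %d acts as rank %d of %d\n", tid, ep_rank, ep_size);

            /* Collectives are invoked once per endpoint, i.e., once per thread. */
            MPI_Barrier(ep_comm[tid]);

            MPI_Comm_free(&ep_comm[tid]);
        }

        MPI_Finalize();
        return 0;
    }

In this sketch, message matching and collectives see per-thread ranks rather than one shared rank per process, which is what allows threaded programming models to interoperate with MPI as if each thread were an MPI process.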
