Foreword to the Special Issue of the Workshop on Exascale MPI (ExaMPI 2017)

The aim of the Workshop on Exascale MPI (ExaMPI 2017), held in conjunction with SC17: The International Conference for High Performance Computing, Networking, Storage and Analysis, was to bring together researchers and developers to present and discuss innovative algorithms and concepts in the Message Passing programming model, and to create a forum for open and potentially controversial discussions on the future of MPI in the Exascale era. This special issue comprises selected papers from the workshop covering innovative algorithms for collective operations; extensions to MPI, including data-centric models; scheduling and routing to avoid network congestion; fault-tolerant communication; interoperability of MPI and PGAS models; and the use of MPI in large-scale simulations.

The first paper, titled “A survey of MPI usage in the US Exascale Computing Project,” provides an analysis of a survey conducted to understand how MPI is currently used, and how it is intended to be used, by applications that are part of the Exascale Computing Project (ECP).1 The results of the analysis yield specific recommendations for MPI implementors, tool developers, and the MPI Forum.

The next three papers focus on the issue of message matching in MPI and offer different approaches to addressing it at exascale. The paper titled “Tail Queues: A Multi-threaded Matching Architecture” introduces a novel parallel matching architecture, with a prototype implementation based on MPICH, to improve the performance of message matching.2 The paper titled “Communication-Aware Message Matching in MPI” proposes a message queue architecture that allocates dedicated queues based on the frequency of communication between processes, reducing both queue search time and memory consumption.3 These improvements result in a speedup of 5 times on the FDS application. The paper titled “Hardware MPI Message Matching: Insights into MPI Matching Behavior to Inform Design” explores the hardware features needed to support efficient message matching by evaluating the matching characteristics of major MPI implementations.4

The next two papers consider support for fault tolerance. The first, titled “EReinit: Scalable and Efficient Fault-Tolerance for Bulk-Synchronous MPI Applications,” describes a global-restart model that improves the recovery time of applications in the presence of faults.5 The second, titled “The Unexpected Virtue of Almost: Exploiting MPI Collective Operations to Approximately Coordinate Checkpoints,” describes an uncoordinated checkpointing mechanism that uses the collective operations already present in an application to trigger checkpoints.6

The paper titled “Optimizing Point-to-Point Communication between Adaptive MPI Endpoints in Shared Memory” describes an approach that optimizes point-to-point communication in a shared-memory environment and thereby improves MPI multithreading support.7 The paper titled “On the Memory Attribution Problem: A Solution and Case Study Using MPI” describes a solution for capturing and analyzing the memory usage of an MPI application and of the MPI library itself.8 Lastly, the paper titled “Twister2: Design of a Big Data Toolkit” describes an architecture that supports different types of data-intensive applications within a unified framework.9

Overall, these nine papers contribute to the knowledge base and advancement of the Message Passing Interface in diverse and useful ways.
While illustrating the staying power of MPI after a quarter century, they also point to opportunities for enhancement at Exascale and to potential new application areas.