Using MPI in high-performance computing services

The Message Passing Interface (MPI) is one of the most portable high-performance computing (HPC) programming models, and platform-optimized implementations are typically delivered with new HPC systems. For distributed services that require portable, high-performance, user-level network access, MPI is therefore an attractive alternative to custom network portability layers, platform-specific methods, and portable but less performant interfaces such as BSD sockets. In this paper, we present our experiences using MPI as the network transport for a large-scale, distributed storage system. We discuss the features of MPI that ease adoption, the challenges we encountered, and our recommendations.
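The paper's transport is not reproduced here, but as a rough sketch of the kind of client/server service MPI can host, the example below uses MPI's standard dynamic-process interface: the server opens a port with MPI_Open_port and waits in MPI_Comm_accept, while the client joins via MPI_Comm_connect. The program name, message contents, and the out-of-band exchange of the port string are illustrative assumptions, not details from the paper; note also that launcher support for dynamic processes varies across MPI implementations.

/* Minimal sketch of an MPI client/server transport (illustrative only;
 * not the storage system described in the paper).
 * Assumes the server's port name is passed to the client out of band,
 * e.g., copied from the server's stdout.
 * Build:  mpicc transport_sketch.c -o transport_sketch
 * Server: mpiexec -n 1 ./transport_sketch server
 * Client: mpiexec -n 1 ./transport_sketch client "<port-name>"
 */
#include <mpi.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    if (argc > 1 && strcmp(argv[1], "server") == 0) {
        char port_name[MPI_MAX_PORT_NAME];
        char buf[64];
        MPI_Comm client;

        /* Obtain a system-specific port name and wait for one client. */
        MPI_Open_port(MPI_INFO_NULL, port_name);
        printf("server port: %s\n", port_name);   /* out-of-band exchange */
        MPI_Comm_accept(port_name, MPI_INFO_NULL, 0, MPI_COMM_SELF, &client);

        /* Service a single request over the resulting intercommunicator. */
        MPI_Recv(buf, sizeof buf, MPI_CHAR, 0, 0, client, MPI_STATUS_IGNORE);
        printf("server received: %s\n", buf);

        MPI_Comm_disconnect(&client);
        MPI_Close_port(port_name);
    } else if (argc > 2 && strcmp(argv[1], "client") == 0) {
        char msg[] = "hello, storage service";
        MPI_Comm server;

        /* Connect using the port name printed by the server. */
        MPI_Comm_connect(argv[2], MPI_INFO_NULL, 0, MPI_COMM_SELF, &server);
        MPI_Send(msg, (int)strlen(msg) + 1, MPI_CHAR, 0, 0, server);
        MPI_Comm_disconnect(&server);
    }

    MPI_Finalize();
    return 0;
}

The accept/connect pair yields an intercommunicator, so the client and server afterwards exchange messages with ordinary MPI point-to-point calls; a persistent service would loop on MPI_Comm_accept and would need to handle connection failures, which plain MPI-2 dynamic processes leave largely to the application.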
