Redundant Execution of HPC Applications with MR-MPI

This paper presents a modular-redundant Message Passing Interface (MPI) solution, MR-MPI, for transparently executing high-performance computing (HPC) applications in a redundant fashion. The presented work addresses the deficiencies of recovery-oriented HPC, i.e., checkpoint/restart to/from a parallel file system, at extreme scale by adding the redundancy approach to the HPC resilience portfolio. It utilizes the MPI performance tool interface, PMPI, to transparently intercept MPI calls from an application and to hide all redundancy-related mechanisms. A redundantly executed application runs with r x m native MPI processes, where r is the number of MPI ranks visible to the application and m is the replication degree. Messages between redundant nodes are replicated, and partial replication for tunable resilience is supported. The performance results clearly show the negative impact of the O(m^2) messages between replicas. For low-level, point-to-point benchmarks, the impact can be as high as the replication degree. For applications, performance depends strongly on the actual communication types and counts. On single-core systems, the overhead can be 0% for embarrassingly parallel applications, independent of the employed redundancy configuration, or up to 70-90% for communication-intensive applications in a dual-redundant configuration. On multi-core systems, the overhead can be significantly higher due to the additional communication contention.
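To make the interception scheme concrete, the following is a minimal, hypothetical sketch of how a PMPI wrapper might replicate a point-to-point send across receiver replicas. The replication degree REPLICATION_DEGREE, the helper native_rank_of(), and the consecutive replica layout are illustrative assumptions, not details of the actual MR-MPI implementation, which must also translate communicators, handle the receive side, and support partial replication.

```c
/*
 * Hypothetical sketch of PMPI-based send interception with message
 * replication. Assumes a fixed replication degree m and that the m
 * replicas of logical rank k occupy consecutive native ranks.
 * Communicator translation and receive-side handling are omitted.
 */
#include <mpi.h>

#define REPLICATION_DEGREE 2   /* m: assumed dual redundancy */

/* Map a logical (application-visible) rank and a replica index to a
 * native MPI rank, under the assumed consecutive replica layout. */
static int native_rank_of(int logical_rank, int replica)
{
    return logical_rank * REPLICATION_DEGREE + replica;
}

/* The PMPI profiling interface lets this wrapper shadow MPI_Send:
 * the application calls MPI_Send as usual, and the wrapper forwards
 * the message to every replica of the logical destination via
 * PMPI_Send, hiding the redundancy from the application. */
int MPI_Send(const void *buf, int count, MPI_Datatype datatype,
             int dest, int tag, MPI_Comm comm)
{
    int rc = MPI_SUCCESS;
    for (int replica = 0; replica < REPLICATION_DEGREE; replica++) {
        rc = PMPI_Send(buf, count, datatype,
                       native_rank_of(dest, replica), tag, comm);
        if (rc != MPI_SUCCESS)
            return rc;
    }
    return rc;
}
```

Since each of the m sender replicas performs this replicated send, one logical message turns into m x m native messages, which is the O(m^2) communication overhead discussed above.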
