Evaluating User-Level Fault Tolerance for MPI Applications
[1] Jack J. Dongarra, et al. FT-MPI: Fault Tolerant MPI, Supporting Dynamic Applications in a Dynamic World, 2000, PVM/MPI.
[2] William Gropp, et al. Fault Tolerance in Message Passing Interface Programs, 2004, Int. J. High Perform. Comput. Appl.
[3] Thomas Hérault, et al. An evaluation of User-Level Failure Mitigation support in MPI, 2012, Computing.
[4] John A. Gunnels, et al. Beyond homogeneous decomposition: scaling long-range forces on Massively Parallel Systems, 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.
[5] Thomas Hérault, et al. A Checkpoint-on-Failure Protocol for Algorithm-Based Recovery in Standard MPI, 2012, Euro-Par.
[6] Laxmikant V. Kalé, et al. CHARM++: a portable concurrent object oriented system based on C++, 1993, OOPSLA '93.
[7] John A. Gunnels, et al. Simulating solidification in metals at high pressure: The drive to petascale computing, 2006.
[8] John A. Gunnels, et al. Extending stability beyond CPU millennium: a micron-scale atomistic simulation of Kelvin-Helmholtz instability, 2007, Proceedings of the 2007 ACM/IEEE Conference on Supercomputing (SC '07).
[9] Laura L. Pullum, et al. Software Fault Tolerance Techniques and Implementation, 2001.
[10] Greg Bronevetsky, et al. Run-Through Stabilization: An MPI Proposal for Process Fault Tolerance, 2011, EuroMPI.
[11] Darius Buntinas. Scalable Distributed Consensus to Support MPI Fault Tolerance, 2012, 2012 IEEE 26th International Parallel and Distributed Processing Symposium.
[12] Michael A. Heroux, et al. Toward Local Failure Local Recovery Resilience Model using MPI-ULFM, 2014, EuroMPI/ASIA.