Extending an Application-Level Checkpointing Tool to Provide Fault Tolerance Support to OpenMP Applications
[1] Gabriel Rodríguez,et al. A Heuristic Approach for the Automatic Insertion of Checkpoints in Message-Passing Codes , 2009, J. Univers. Comput. Sci..
[2] Gabriel Rodríguez,et al. Compiler-assisted checkpointing of message-passing applications in heterogeneous environments , 2008 .
[3] Milo M. K. Martin,et al. SafetyNet: improving the availability of shared memory multiprocessors with global checkpoint/recovery , 2002, Proceedings 29th Annual International Symposium on Computer Architecture.
[4] Kai Li,et al. CLIP: A Checkpointing Tool for Message Passing Parallel Programs , 1997, ACM/IEEE SC 1997 Conference (SC'97).
[5] Heon Young Yeom,et al. MPICH-GF: Transparent Checkpointing and Rollback-Recovery for Grid-Enabled MPI Processes , 2004, IEICE Trans. Inf. Syst..
[6] B. Bouteiller,et al. MPICH-V2: a Fault Tolerant MPI for Volatile Nodes based on Pessimistic Sender Based Message Logging , 2003, ACM/IEEE SC 2003 Conference (SC'03).
[7] Gene Cooperman,et al. DMTCP: Transparent checkpointing for cluster computations and the desktop , 2007, 2009 IEEE International Symposium on Parallel & Distributed Processing.
[8] Erik Seligman,et al. Dome: Distributed Object Migration Environment , 1994 .
[9] Gabriel Rodríguez,et al. Analysis of Performance-impacting Factors on Checkpointing Frameworks: The CPPC Case Study , 2011, Comput. J..
[10] Peter K. Szwed,et al. Application-level checkpointing for shared memory programs , 2004, ASPLOS XI.
[11] Sunil Ahn,et al. PC/MPI: Design and Implementation of a Portable MPI Checkpointer , 2003, PVM/MPI.
[12] Keshav Pingali,et al. Experimental evaluation of application-level checkpointing for OpenMP programs , 2006, ICS '06.
[13] Mohamed Shawky,et al. Using Dynamic Task Level Redundancy for OpenMP Fault Tolerance , 2012, ARCS.
[14] Georg Stellner,et al. CoCheck: checkpointing and process migration for MPI , 1996, Proceedings of International Conference on Parallel Processing.
[15] William R. Dieter,et al. A user-level checkpointing library for POSIX threads programs , 1999, Digest of Papers. Twenty-Ninth Annual International Symposium on Fault-Tolerant Computing (Cat. No.99CB36352).
[16] Josep Torrellas,et al. ReVive: cost-effective architectural support for rollback recovery in shared-memory multiprocessors , 2002, ISCA.
[17] Roberto R. Osorio,et al. Improving Scalability of Application-Level Checkpoint-Recovery by Reducing Checkpoint Sizes , 2013, New Generation Computing.
[18] Gabriel Rodríguez,et al. CPPC: a compiler‐assisted tool for portable checkpointing of message‐passing applications , 2010, Concurr. Comput. Pract. Exp..
[19] W. Walker,et al. MPI: A Standard Message Passing Interface , 1996 .
[20] Yan Ding,et al. Using Redundant Threads for Fault Tolerance of OpenMP Programs , 2010, 2010 International Conference on Information Science and Applications.