Automated application-level checkpointing based on live-variable analysis in MPI programs
Xuejun Yang | Jia Jia | Yunfei Du | Hongyi Fu | Zhiyun Wang | Panfeng Wang
[1] E. N. Elnozahy, et al. Checkpointing for peta-scale systems: a look into the future of practical rollback-recovery, 2004, IEEE Transactions on Dependable and Secure Computing.
[2] Xuejun Yang, et al. The Fault Tolerant Parallel Algorithm: the Parallel Recomputing Based Failure Recovery, 2007, 16th International Conference on Parallel Architecture and Compilation Techniques (PACT 2007).
[3] Daniel Marques, et al. Automated application-level checkpointing of MPI programs, 2003, PPoPP '03.