An adaptive checkpointing protocol to bound recovery time with message logging
[1] Jian Xu,et al. Adaptive independent checkpointing for reducing rollback propagation , 1993, Proceedings of 1993 5th IEEE Symposium on Parallel and Distributed Processing.
[2] Nuno Neves,et al. Using time to improve the performance of coordinated checkpointing , 1996, Proceedings of IEEE International Computer Performance and Dependability Symposium.
[3] Luís Moura Silva,et al. System-level versus user-defined checkpointing , 1998, Proceedings Seventeenth IEEE Symposium on Reliable Distributed Systems (Cat. No.98CB36281).
[4] Nitin H. Vaidya,et al. Impact of Checkpoint Latency on Overhead Ratio of a Checkpointing Scheme , 1997, IEEE Trans. Computers.
[5] W. Kent Fuchs,et al. PREACHES-portable recovery and checkpointing in heterogeneous systems , 1998, Digest of Papers. Twenty-Eighth Annual International Symposium on Fault-Tolerant Computing (Cat. No.98CB36224).
[6] Erik Seligman,et al. Application Level Fault Tolerance in Heterogenous Networks of Workstations , 1997, J. Parallel Distributed Comput.
[7] Erol Gelenbe,et al. On the Optimum Checkpoint Interval , 1979, JACM.
[8] Jehoshua Bruck,et al. An On-Line Algorithm for Checkpoint Placement , 1997, IEEE Trans. Computers.
[9] Kai Li,et al. Libckpt: Transparent Checkpointing under UNIX , 1995, USENIX.
[10] James S. Plank,et al. Experimental assessment of workstation failures and their impact on checkpointing systems , 1998, Digest of Papers. Twenty-Eighth Annual International Symposium on Fault-Tolerant Computing (Cat. No.98CB36224).
[11] Lorenzo Alvisi,et al. Reasons for a pessimistic or optimistic message logging protocol in MPI uncoordinated failure, recovery , 2009, 2009 IEEE International Conference on Cluster Computing and Workshops.
[12] Victor F. Nicola,et al. Comparative Analysis of Different Models of Checkpointing and Recovery , 1990, IEEE Trans. Software Eng..
[13] Jian Xu,et al. Necessary and Sufficient Conditions for Consistent Global Snapshots , 1995, IEEE Trans. Parallel Distributed Syst..
[14] Mukesh Singhal,et al. Using logging and asynchronous checkpointing to implement recoverable distributed shared memory , 1993, Proceedings of 1993 IEEE 12th Symposium on Reliable Distributed Systems.
[15] W. Kent Fuchs,et al. Branch recovery with compiler-assisted multiple instruction retry , 1992, [1992] Digest of Papers. FTCS-22: The Twenty-Second International Symposium on Fault-Tolerant Computing.
[16] John W. Young,et al. A first order approximation to the optimum checkpoint interval , 1974, CACM.
[17] Yi-Min Wang,et al. Checkpointing and its applications , 1995, Twenty-Fifth International Symposium on Fault-Tolerant Computing. Digest of Papers.
[18] Golden G. Richard,et al. On patterns for practical fault tolerant software in Java , 1998, Proceedings Seventeenth IEEE Symposium on Reliable Distributed Systems (Cat. No.98CB36281).
[19] Jacques Malenfant,et al. Computing Optimal Checkpointing Strategies for Rollback and Recovery Systems , 1988, IEEE Trans. Computers.
[20] W. Kent Fuchs,et al. Message logging in mobile computing , 1999, Digest of Papers. Twenty-Ninth Annual International Symposium on Fault-Tolerant Computing (Cat. No.99CB36352).
[21] Edward G. Coffman,et al. Optimal strategies for scheduling checkpoints and preventive maintenance , 1990.
[22] Nian-Feng Tzeng,et al. Logging and recovery in adaptive software distributed shared memory systems , 1999, Proceedings of the 18th IEEE Symposium on Reliable Distributed Systems.
[23] Nuno Neves,et al. Coordinated checkpointing without direct coordination , 1998, Proceedings. IEEE International Computer Performance and Dependability Symposium. IPDS'98 (Cat. No.98TB100248).
[24] W. Kent Fuchs,et al. Compiler-assisted full checkpointing , 1994, Softw. Pract. Exp.
[25] W. Kent Fuchs,et al. Reduced overhead logging for rollback recovery in distributed shared memory , 1995, Twenty-Fifth International Symposium on Fault-Tolerant Computing. Digest of Papers.
[26] Robert Geist,et al. Selection of a checkpoint interval in a critical-task environment , 1988.
[27] Andrzej Duda,et al. The Effects of Checkpointing on Program Execution Time , 1983, Inf. Process. Lett.
[28] Özalp Babaoglu,et al. On the Optimum Checkpoint Selection Problem , 1984, SIAM J. Comput.
[29] Jehoshua Bruck,et al. An on-line algorithm for checkpoint placement , 1996, Proceedings of ISSRE '96: 7th International Symposium on Software Reliability Engineering.
[30] Mukesh Singhal,et al. Low-cost checkpointing with mutable checkpoints in mobile computing systems , 1998, Proceedings. 18th International Conference on Distributed Computing Systems (Cat. No.98CB36183).
[31] William H. Sanders,et al. Performance analysis of two time-based coordinated checkpointing protocols , 1997, Proceedings Pacific Rim International Symposium on Fault-Tolerant Systems.
[32] Jacob A. Abraham,et al. Compiler-assisted static checkpoint insertion , 1992, [1992] Digest of Papers. FTCS-22: The Twenty-Second International Symposium on Fault-Tolerant Computing.
[33] Micah Beck,et al. Compiler-Assisted Checkpointing , 1994.
[34] Lorenzo Alvisi,et al. An analysis of communication induced checkpointing , 1999, Digest of Papers. Twenty-Ninth Annual International Symposium on Fault-Tolerant Computing (Cat. No.99CB36352).
[35] B. Ramkumar,et al. Portable checkpointing for heterogeneous architectures , 1997, Proceedings of IEEE 27th International Symposium on Fault Tolerant Computing.
[36] Nuno Neves,et al. RENEW: a tool for fast and efficient implementation of checkpoint protocols , 1998, Digest of Papers. Twenty-Eighth Annual International Symposium on Fault-Tolerant Computing (Cat. No.98CB36224).