Communication-Based Prevention of Non-P-Pattern
Zhibo Wu | Zhan Zhang | Xiao-Zong Yang | De-Cheng Zuo | Yi-Wei Ci
[1] Nuno Neves,et al. Coordinated checkpointing without direct coordination , 1998, Proceedings. IEEE International Computer Performance and Dependability Symposium. IPDS'98 (Cat. No.98TB100248).
[2] Richard D. Schlichting,et al. Fail-stop processors: an approach to designing fault-tolerant computing systems , 1983, TOCS.
[3] Willy Zwaenepoel,et al. The performance of consistent checkpointing , 1992, Proceedings 11th Symposium on Reliable Distributed Systems.
[4] Brian Randell,et al. System structure for software fault tolerance , 1975, IEEE Transactions on Software Engineering.
[5] Lorenzo Alvisi,et al. Causality tracking in causal message-logging protocols , 2002, Distributed Computing.
[6] Achour Mostéfaoui,et al. Communication-based prevention of useless checkpoints in distributed computations , 2000, Distributed Computing.
[7] Jeffrey S. Vetter,et al. Communication characteristics of large-scale scientific applications for contemporary cluster architectures , 2002, Proceedings 16th International Parallel and Distributed Processing Symposium.
[8] L. Alvisi,et al. A Survey of Rollback-Recovery Protocols , 2002.
[9] D. Manivannan,et al. Quasi-Synchronous Checkpointing: Models, Characterization, and Classification , 1999, IEEE Trans. Parallel Distributed Syst.
[10] Chita R. Das,et al. Towards a communication characterization methodology for parallel applications , 1997, Proceedings Third International Symposium on High-Performance Computer Architecture.
[11] Lorenzo Alvisi,et al. An analysis of communication induced checkpointing , 1999, Digest of Papers. Twenty-Ninth Annual International Symposium on Fault-Tolerant Computing (Cat. No.99CB36352).
[12] Jichiang Tsai. On Properties of RDT Communication-Induced Checkpointing Protocols , 2003, IEEE Trans. Parallel Distributed Syst.
[13] W. Kent Fuchs,et al. Checkpoint Space Reclamation for Uncoordinated Checkpointing in Message-Passing Systems , 1995, IEEE Trans. Parallel Distributed Syst.
[14] Brian Randell. System structure for software fault tolerance , 1975.
[15] Yixin Yang,et al. A Novel Roll-Back Mechanism for Performance Enhancement of Asynchronous Checkpointing and Recovery , 2007, Informatica.
[16] D. Manivannan,et al. A low-overhead recovery technique using quasi-synchronous checkpointing , 1996, Proceedings of 16th International Conference on Distributed Computing Systems.
[17] Roberto Baldoni,et al. An Index-Based Checkpointing Algorithm for Autonomous Distributed Systems , 1999, IEEE Trans. Parallel Distributed Syst.
[18] David J. Lilja,et al. Exploiting multiple heterogeneous networks to reduce communication costs in parallel programs , 1997, Proceedings Sixth Heterogeneous Computing Workshop (HCW'97).
[19] Richard Koo,et al. Checkpointing and Rollback-Recovery for Distributed Systems , 1986, IEEE Transactions on Software Engineering.
[20] David J. Lilja,et al. Characterization of Communication Patterns in Message-Passing Parallel Scientific Application Programs , 1998, CANPC.
[21] Islene C. Garcia,et al. Non-Blocking Synchronous Checkpointing Based on Rollback-Dependency Trackability , 2006, 2006 25th IEEE Symposium on Reliable Distributed Systems (SRDS'06).