A Scalable Communication-Induced Checkpointing Algorithm for Distributed Systems
Khalil Drira | Pilar Gómez-Gil | Saúl E. Pomares Hernández | Alberto Calixto Simon | Jose Roberto Perez Cruz
[1] Achour Mostéfaoui, et al. Communication-based prevention of useless checkpoints in distributed computations, 2000, Distributed Computing.
[2] Michel Raynal, et al. Tracking immediate predecessors in distributed computations, 2002, SPAA '02.
[3] Saul E. Pomares Hernandez, et al. The Immediate Dependency Relation: An Optimal Way to Ensure Causal Group Communication, 2003.
[4] Lorenzo Alvisi, et al. An analysis of communication induced checkpointing, 1999, Digest of Papers. Twenty-Ninth Annual International Symposium on Fault-Tolerant Computing (Cat. No.99CB36352).
[5] Leslie Lamport, et al. Time, clocks, and the ordering of events in a distributed system, 1978, CACM.
[6] Jian Xu, et al. Necessary and Sufficient Conditions for Consistent Global Snapshots, 1995, IEEE Trans. Parallel Distributed Syst.
[7] Jenn-Wei Lin, et al. On the fully-informed communication-induced checkpointing protocol, 2005, 11th Pacific Rim International Symposium on Dependable Computing (PRDC'05).
[8] D. Manivannan, et al. FINE: A Fully Informed aNd Efficient communication-induced checkpointing protocol for distributed systems, 2009, J. Parallel Distributed Comput.
[9] Brian Randell. System structure for software fault tolerance, 1975.