A Concurrent Partial Snapshot Algorithm for Large-Scale and Dynamic Distributed Systems

[1]  Leslie Lamport,et al.  Distributed snapshots: determining global states of distributed systems , 1985, TOCS.

[2]  Ajay D. Kshemkalyani,et al.  Fast and Message-Efficient Global Snapshot Algorithms for Large-Scale Distributed Systems , 2010, IEEE Transactions on Parallel and Distributed Systems.

[3]  Toshimitsu Masuzawa,et al.  Brief Announcement: A Concurrent Partial Snapshot Algorithm for Large-Scale and Dynamic Distributed Systems , 2011, SSS.

[4]  Mukesh Singhal,et al.  Maximal global snapshot with concurrent initiators , 1994, Proceedings of 1994 6th IEEE Symposium on Parallel and Distributed Processing.

[5]  Friedemann Mattern,et al.  Efficient Algorithms for Distributed Snapshots and Global Virtual Time Approximation , 1993, J. Parallel Distributed Comput.

[6]  Bruno Ciciani,et al.  A VP-accordant checkpointing protocol preventing useless checkpoints , 1998, Proceedings Seventeenth IEEE Symposium on Reliable Distributed Systems.

[7]  Achour Mostéfaoui,et al.  From static distributed systems to dynamic systems , 2005, 24th IEEE Symposium on Reliable Distributed Systems (SRDS'05).

[8]  Richard Koo,et al.  Checkpointing and Rollback-Recovery for Distributed Systems , 1986, IEEE Transactions on Software Engineering.

[9]  Makoto Takizawa,et al.  Checkpoint and rollback in asynchronous distributed systems , 1997, Proceedings of INFOCOM '97.

[10]  Achour Mostéfaoui,et al.  Preventing useless checkpoints in distributed computations , 1997, Proceedings of SRDS'97: 16th IEEE Symposium on Reliable Distributed Systems.

[11]  Ten-Hwang Lai,et al.  On Distributed Snapshots , 1987, Inf. Process. Lett..

[12]  Ajay D. Kshemkalyani,et al.  Distributed Computing: Principles, Algorithms, and Systems , 2008.

[13]  Tadashi Araragi,et al.  Dynamic snapshot algorithm and partial rollback algorithm for internet agents , 2005.

[14]  Augusto Ciuffoletti,et al.  A Distributed Domino-Effect free recovery Algorithm , 1984, Symposium on Reliability in Distributed Software and Database Systems.

[15]  Javier García,et al.  Benchmarking of Web Services Platforms - An Evaluation with the TPC-APP Benchmark , 2006, WEBIST.

[16]  Brian Randell.  System structure for software fault tolerance , 1975.

[17]  Nancy A. Lynch,et al.  Global States of a Distributed System , 1982, IEEE Transactions on Software Engineering.

[18]  Friedemann Mattern,et al.  Virtual Time and Global States of Distributed Systems , 2002.

[19]  S. Venkatesan,et al.  Crash recovery with little overhead , 1991, Proceedings of the 11th International Conference on Distributed Computing Systems.

[20]  Vijay K. Garg,et al.  Scalable algorithms for global snapshots in distributed systems , 2006, ICS '06.

[21]  Vijay K. Garg,et al.  Efficient Algorithms for Global Snapshots in Large Distributed Systems , 2010, IEEE Transactions on Parallel and Distributed Systems.

[22]  Madalene Spezialetti,et al.  Efficient Distributed Snapshots , 1986, ICDCS.