A Simple Snapshot Algorithm for Multicore Systems

An atomic snapshot object is an object that can be concurrently accessed by n asynchronous processes prone to crash. It is made up of m components (base atomic registers) and is defined by two operations: an update operation that allows a process to atomically assign a new value to a component, and a snapshot operation that atomically reads and returns the values of all the components. To cope with the net effect of concurrency, asynchrony and failures, the algorithm implementing the update operation has to help concurrent snapshot operations so that they can always terminate. This paper presents a new and particularly simple construction of a snapshot object. The construction relies on a new principle that we call the "write first, help later" strategy. This strategy directs an update operation first to write its value and only then to compute a helping snapshot value that a snapshot operation can use in order to terminate. Interestingly, the algorithms implementing the snapshot and update operations are not only simple and easy to prove correct, but also efficient in terms of the number of accesses to the underlying atomic registers shared by the processes. An operation costs O(m) in the best case and O(n × m) in the worst case.
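
To make the interface concrete, the following is a minimal Python sketch of the sequential behaviour a snapshot object has to provide: update assigns one component, and snapshot returns all m components as they stood at a single instant. It is not the construction presented in this paper. It obtains atomicity with a lock, so it is blocking and does not tolerate a process crashing while holding the lock, and it contains no helping mechanism. The names LockedSnapshot, update and snapshot are hypothetical and chosen only for illustration.

```python
import threading
from typing import Any, List


class LockedSnapshot:
    """Blocking reference model of an m-component snapshot object.

    Assumption: this only illustrates the specification (what update and
    snapshot must return). It is NOT the wait-free "write first, help
    later" algorithm; a lock stands in for the atomicity guarantee.
    """

    def __init__(self, m: int, initial: Any = None) -> None:
        # One slot per component; all start with the same initial value.
        self._components: List[Any] = [initial] * m
        self._lock = threading.Lock()

    def update(self, index: int, value: Any) -> None:
        """Atomically assign `value` to component `index`."""
        with self._lock:
            self._components[index] = value

    def snapshot(self) -> List[Any]:
        """Atomically read all components and return a private copy."""
        with self._lock:
            return list(self._components)
```

For example, after creating `LockedSnapshot(3, 0)` and calling `update(1, 7)`, a call to `snapshot()` returns `[0, 7, 0]`. A crash-tolerant implementation must give the same answers as this model while using only reads and writes of atomic registers, which is where helping becomes necessary.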
