RAMBO: A Reconfigurable Atomic Memory Service for Dynamic Networks

This paper presents an algorithm that emulates atomic read/write shared objects in a dynamic network setting. To ensure availability and fault tolerance, the objects are replicated. To ensure atomicity, reads and writes are performed using quorum configurations, each of which consists of a set of members plus sets of read-quorums and write-quorums. The algorithm is reconfigurable: the quorum configurations may change during computation, and such changes do not cause violations of atomicity. Any quorum configuration may be installed at any time. The algorithm tolerates processor stopping failures and message loss. The algorithm performs three major tasks, all concurrently: reading and writing objects, introducing new configurations, and "garbage-collecting" obsolete configurations. The algorithm guarantees atomicity for arbitrary patterns of asynchrony and failure. The algorithm satisfies a variety of conditional performance properties, based on timing and failure assumptions. In the "normal case", the latency of read and write operations is at most 8d, where d is the maximum message delay.
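The structure of a quorum configuration described above can be illustrated with a small sketch. It is not the paper's specification: the names `Configuration`, `is_well_formed`, and `majority_configuration` are hypothetical, and the check that every read-quorum intersects every write-quorum is an assumption stated here (the abstract does not spell it out), although it is the usual condition under which a read is guaranteed to see the latest completed write.

```python
from dataclasses import dataclass
from itertools import combinations, product


@dataclass(frozen=True)
class Configuration:
    """Hypothetical model of a quorum configuration: members plus quorum sets."""
    members: frozenset
    read_quorums: tuple   # tuple of frozensets of member ids
    write_quorums: tuple  # tuple of frozensets of member ids

    def is_well_formed(self) -> bool:
        # Every quorum must be drawn from the configuration's members.
        quorums = self.read_quorums + self.write_quorums
        if not all(q <= self.members for q in quorums):
            return False
        # Assumed condition: each read-quorum shares at least one member
        # with each write-quorum.
        return all(r & w for r, w in product(self.read_quorums, self.write_quorums))


def majority_configuration(members) -> Configuration:
    """Build a configuration whose read- and write-quorums are all minimal majorities."""
    members = frozenset(members)
    size = len(members) // 2 + 1
    majorities = tuple(frozenset(c) for c in combinations(sorted(members), size))
    return Configuration(members, majorities, majorities)


if __name__ == "__main__":
    good = majority_configuration({"a", "b", "c"})
    print(good.is_well_formed())

    # Disjoint read and write quorums violate the assumed intersection condition.
    bad = Configuration(frozenset("abcd"), (frozenset("ab"),), (frozenset("cd"),))
    print(bad.is_well_formed())
```

Majorities are only one convenient choice; any family of read- and write-quorums that passes the same check would serve equally well in this sketch.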
