At-Most-Once Semantics in Asynchronous Shared Memory

At-most-once semantics is one of the standard models for object access in decentralized systems. Accessing an object with at-most-once semantics, whether by altering its state through direct access, method invocation, or remote procedure call, guarantees that the access is not repeated more than once, enabling one to reason about the safety properties of the object. This paper investigates implementations of at-most-once access semantics in a model where a set of such actions is to be performed by a set of failure-prone, asynchronous shared-memory processes. We introduce a definition of the at-most-once problem for performing a set of n jobs using m processors, and we introduce a notion of efficiency for such protocols, called effectiveness, which is used to classify algorithms. Effectiveness measures the number of jobs safely completed by an implementation, as a function of the overall number of jobs n, the number of participating processes m, and the number of process crashes f in the presence of an adversary. We prove a lower bound of n - f on the effectiveness of any algorithm. We then prove that this lower bound can be matched in the two-process setting by presenting two algorithms that offer a tradeoff between time and space complexity. Finally, we generalize our two-process solution to the multi-process setting with a hierarchical algorithm that achieves effectiveness of n - log m · o(n), coming reasonably close, asymptotically, to the corresponding lower bound.
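To make the two-process setting concrete, the following Python sketch simulates one simple way two processes might share n jobs with at-most-once safety. It is an illustration under stated assumptions and is not claimed to be either of the paper's two algorithms: process 0 sweeps upward from job 1, process 1 sweeps downward from job n, and each announces a job in a shared register before checking the other's register. The function name `run_two_process`, the `announced` registers, and the `schedule` argument (an adversarial interleaving, one atomic step per entry) are all hypothetical names introduced here.

```python
from typing import Iterator, List


def run_two_process(n: int, schedule: List[int]) -> List[List[int]]:
    """Simulate a two-process at-most-once sketch under a given interleaving.

    Jobs are numbered 1..n. Each entry of `schedule` lets that process take
    one step containing at most one shared-memory access. A process that is
    never scheduled again models a crash.
    """
    # Shared registers (assumed atomic): the job each process last announced.
    announced = [0, n + 1]
    performed: List[List[int]] = [[], []]

    def process(pid: int) -> Iterator[None]:
        other = 1 - pid
        step = 1 if pid == 0 else -1
        job = 1 if pid == 0 else n
        while 1 <= job <= n:
            announced[pid] = job        # write: announce intent to do `job`
            yield
            seen = announced[other]     # read: how far has the other swept?
            yield
            # Stop as soon as the other's announcement meets or crosses ours.
            if (pid == 0 and seen <= job) or (pid == 1 and seen >= job):
                return
            performed[pid].append(job)  # the other process will never do it
            job += step

    procs = [process(0), process(1)]
    for pid in schedule:
        next(procs[pid], None)          # finished processes ignore extra steps
    return performed
```

With n = 5 and a strict round-robin schedule, `run_two_process(5, [0, 1] * 20)` returns `[[1, 2], [5, 4]]`: both processes announce job 3, both back off, and exactly one job is left undone. If process 0 takes a single step and then crashes, `run_two_process(5, [0] + [1] * 40)` returns `[[], [5, 4, 3, 2]]`, so n - 1 jobs are completed with f = 1. In this sketch no job is ever performed twice, because a process performs a job only after announcing it and then observing that the other process has not yet reached it.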