Optimal Resilience in Systems that Mix Shared Memory and Message Passing

We investigate the minimal number of failures that can partition a system where processes communicate both through shared memory and by message passing. We prove that this number precisely captures the resilience that can be achieved by algorithms that implement a variety of shared objects, like registers and atomic snapshots, and solve common tasks, like randomized consensus, approximate agreement and renaming. This has implications for the m&m-model of [5] and for the hybrid, cluster-based model of [29, 32].

2012 ACM Subject Classification: Theory of computation → Distributed computing models; Theory of computation → Concurrent algorithms; Computing methodologies → Distributed algorithms

[1] Xing Hu, et al. Randomized Consensus with Regular Registers, 2020, ArXiv.

[2] Maurice Herlihy, et al. The asynchronous computability theorem for t-resilient tasks, 1993, STOC.

[3] Hagit Attiya, et al. Renaming in an asynchronous environment, 1990, JACM.

[4] Peter Robinson, et al. Easy impossibility proofs for k-set agreement in message passing systems, 2011, PODC '11.

[5] Maurice Herlihy, et al. On the space complexity of randomized synchronization, 1993, PODC '93.

[6] Wojciech M. Golab, et al. Linearizable implementations do not suffice for randomized distributed computation, 2011, STOC '11.

[7] Nancy A. Lynch, et al. Impossibility of distributed consensus with one faulty process, 1983, PODS '83.

[8] Maurice Herlihy, et al. Linearizability: a correctness condition for concurrent objects, 1990, TOPL.

[9] Michael Ben-Or, et al. Another advantage of free choice (Extended Abstract): Completely asynchronous agreement protocols, 1983, PODC '83.

[10] Hagit Attiya, et al. Adaptive and Efficient Algorithms for Lattice Agreement and Renaming, 2002, SIAM J. Comput.

[11] Michel Raynal, et al. The weakest failure detector to implement a register in asynchronous systems with hybrid communication, 2013, Theor. Comput. Sci.

[12] Thomas F. Wenisch, et al. Disaggregated memory for expansion and sharing in blade servers, 2009, ISCA '09.

[13] William Feller, et al. An Introduction to Probability Theory and Its Applications, 1950.

[14] Danny Dolev, et al. A partial equivalence between shared-memory and message-passing in an asynchronous fail-stop distributed environment, 1993, Mathematical systems theory.

[15] Baruch Awerbuch, et al. Atomic shared register access by asynchronous hardware, 1986, 27th Annual Symposium on Foundations of Computer Science (sfcs 1986).

[16] Sam Toueg, et al. Optimal Register Construction in M&M Systems, 2019, OPODIS.

[17] Marcos K. Aguilera, et al. Passing Messages while Sharing Memory, 2018, PODC.

[18] Hagit Attiya, et al. Putting Strong Linearizability in Context: Preserving Hyperproperties in Programs that Use Concurrent Objects, 2019, DISC.

[19] Michael E. Saks, et al. Wait-free k-set agreement is impossible: the topology of public knowledge, 1993, STOC.

[20] Hagit Attiya, et al. An adaptive collect algorithm with applications, 2002, Distributed Computing.

[21] Leslie Lamport, et al. On Interprocess Communication-Part I: Basic Formalism, Part II: Algorithms, 2016.

[22] Jiannong Cao, et al. One for All and All for One: Scalable Consensus in a Hybrid Communication Model, 2019, 2019 IEEE 39th International Conference on Distributed Computing Systems (ICDCS).

[23] Nancy A. Lynch, et al. Are wait-free algorithms fast?, 1994, JACM.

[24] Sam Toueg, et al. Asynchronous consensus and broadcast protocols, 1985, JACM.

[25] Hagit Attiya, et al. Sharing memory robustly in message-passing systems, 1990, PODC '90.

[26] Hagit Attiya, et al. Atomic snapshots in O(n log n) operations, 1993, PODC '93.

[27] Soma Chaudhuri, et al. More Choices Allow More Faults: Set Consensus Problems in Totally Asynchronous Systems, 1993, Inf. Comput.

[28] Maurice Herlihy, et al. Fast Randomized Consensus Using Shared Memory, 1990, J. Algorithms.

[29] Eli Gafni, et al. Generalized FLP impossibility result for t-resilient asynchronous computations, 1993, STOC.