Of Choices, Failures and Asynchrony: The Many Faces of Set Agreement

Set agreement is a fundamental problem in distributed computing in which processes collectively choose a small subset of values from a larger set of proposals. The impossibility of fault-tolerant set agreement in asynchronous networks is one of the seminal results in distributed computing. In synchronous networks, too, the complexity of set agreement has been a significant research challenge that has now been resolved. Real systems, however, are neither purely synchronous nor purely asynchronous. Rather, they tend to alternate between periods of synchrony and periods of asynchrony. Nothing specific is known about the complexity of set agreement in such a “partially synchronous” setting.

In this paper, we address this challenge, presenting the first (asymptotically) tight bound on the complexity of set agreement in such systems. We introduce a novel technique for simulating, in a fault-prone asynchronous shared memory, executions of an asynchronous and failure-prone message-passing system in which some fragments appear synchronous to some processes.

We use this simulation technique to derive a lower bound on the round complexity of set agreement in a partially synchronous system by a reduction from asynchronous wait-free set agreement. Specifically, we show that every set agreement protocol requires at least $\lfloor\frac{t}{k}\rfloor + 2$ synchronous rounds to decide. We present an (asymptotically) matching algorithm that relies on a distributed asynchrony detection mechanism to decide as soon as possible during periods of synchrony. From these two results, we derive the size of the minimal window of synchrony needed to solve set agreement.

By relating synchronous, asynchronous and partially synchronous environments, our simulation technique is of independent interest. In particular, it allows us to obtain a new lower bound on the complexity of early deciding k-set agreement complementary to that of Gafni et al. (in SIAM J. Comput. 40(1):63–78, 2011), and to re-derive the combinatorial topology lower bound of Guerraoui et al. (in Theor. Comput. Sci. 410(6–7):570–580, 2009) in an algorithmic way.
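
To make the lower bound $\lfloor\frac{t}{k}\rfloor + 2$ concrete, the following minimal sketch evaluates it for a few parameter choices. The abstract does not define the symbols; the sketch assumes the usual reading, in which t is the number of process failures tolerated and k is the maximum number of distinct values that may be decided. The function name `min_sync_rounds` is hypothetical and used only for illustration.

```python
def min_sync_rounds(t: int, k: int) -> int:
    """Evaluate the lower bound floor(t / k) + 2 stated in the abstract.

    Assumption (not defined in the abstract): t is the number of tolerated
    failures and k is the number of distinct values that may be decided.
    """
    if t < 0 or k < 1:
        raise ValueError("expected t >= 0 and k >= 1")
    return t // k + 2  # integer division implements the floor


# A few illustrative parameter choices.
for t, k in [(4, 1), (4, 2), (5, 2), (2, 3)]:
    print(f"t={t}, k={k}: at least {min_sync_rounds(t, k)} synchronous rounds")
```

Under this reading, k = 1 corresponds to consensus, where the expression becomes t + 2, and allowing more decided values (larger k) lowers the bound.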

[1] Leslie Lamport, et al. The Byzantine Generals Problem, 1982, TOPL.

[2] Hagit Attiya, et al. Atomic snapshots in O(n log n) operations, 1993, PODC '93.

[3] Soma Chaudhuri, et al. More Choices Allow More Faults: Set Consensus Problems in Totally Asynchronous Systems, 1993, Inf. Comput.

[4] Eli Gafni. The extended BG-simulation and the characterization of t-resiliency, 2009, STOC '09.

[5] Rachid Guerraoui, et al. A Topological Treatment of Early-Deciding Set-Agreement, 2006, OPODIS.

[6] André Schiper, et al. The Heard-Of model: computing in distributed systems with benign faults, 2009, Distributed Computing.

[7] Rachid Guerraoui, et al. The Complexity of Early Deciding Set Agreement, 2011, SIAM J. Comput.

[8] Hagit Attiya, et al. Atomic Snapshots in O(n log n) Operations, 1998, SIAM J. Comput.

[9] Michael E. Saks, et al. Wait-free k-set agreement is impossible: the topology of public knowledge, 1993, STOC.

[10] Maurice Herlihy, et al. The topological structure of asynchronous computability, 1999, JACM.

[11] Nir Shavit, et al. Atomic snapshots of shared memory, 1990, JACM.

[12] Nancy A. Lynch, et al. A tight lower bound for k-set agreement, 1993, Proceedings of 1993 IEEE 34th Annual Foundations of Computer Science.

[13] Eli Gafni, et al. Generalized FLP impossibility result for t-resilient asynchronous computations, 1993, STOC.

[14] Michel Raynal, et al. Strongly Terminating Early-Stopping k-Set Agreement in Synchronous Systems with General Omission Failures, 2008, Theory of Computing Systems.

[15] Rachid Guerraoui, et al. The inherent price of indulgence, 2002, PODC '02.

[16] Eli Gafni, et al. Round-by-Round Fault Detectors: Unifying Synchrony and Asynchrony (Extended Abstract), 1998, PODC '98.

[17] Nancy A. Lynch, et al. Tight bounds for k-set agreement, 2000, J. ACM.

[18] Idit Keidar, et al. Timeliness, failure-detectors, and consensus performance, 2006, PODC '06.

[19] Rachid Guerraoui, et al. From a static impossibility to an adaptive lower bound: the complexity of early deciding set agreement, 2005, STOC '05.

[20] Ulrich Schmid, et al. Booting clock synchronization in partially synchronous systems with hybrid process and link failures, 2007, Distributed Computing.

[21] Dan Alistarh, et al. How to Solve Consensus in the Smallest Window of Synchrony, 2008, DISC.

[22] Nancy A. Lynch, et al. Consensus in the presence of partial synchrony, 1988, JACM.

[23] Vinton G. Cerf, et al. A protocol for packet network intercommunication, 1974, CCRV.

[24] Martin Biely, et al. Synchronous consensus under hybrid process and link failures, 2011, Theor. Comput. Sci.

[25] Eli Gafni, et al. Structured derivations of consensus algorithms for failure detectors, 1998, PODC '98.