Self-Stabilising Byzantine Clock Synchronisation Is Almost as Easy as Consensus

We give fault-tolerant algorithms for establishing synchrony in distributed systems in which each of the $n$ nodes has its own clock. Our algorithms operate in a very strong fault model: we require self-stabilisation, i.e., the initial state of the system may be arbitrary, and there can be up to $f<n/3$ ongoing Byzantine faults, i.e., nodes that deviate from the protocol in an arbitrary manner. Furthermore, we assume that the local clocks of the nodes may progress at different speeds (clock drift) and communication has bounded delay. In this model, we study the pulse synchronisation problem, where the task is to guarantee that eventually all correct nodes generate well-separated local pulse events (i.e., unlabelled logical clock ticks) in a synchronised manner.

Compared to prior work, we achieve exponential improvements in stabilisation time and the number of communicated bits, and give the first sublinear-time algorithm for the problem:

- In the deterministic setting, the state-of-the-art solutions stabilise in time $\Theta(f)$ and have each node broadcast $\Theta(f \log f)$ bits per time unit. We exponentially reduce the number of bits broadcasted per time unit to $\Theta(\log f)$ while retaining the same stabilisation time.
- In the randomised setting, the state-of-the-art solutions stabilise in time $\Theta(f)$ and have each node broadcast $O(1)$ bits per time unit. We exponentially reduce the stabilisation time to $\log^{O(1)} f$ while each node broadcasts $\log^{O(1)} f$ bits per time unit.

These results are obtained by means of a recursive approach reducing the above task of self-stabilising pulse synchronisation in the bounded-delay model to non-self-stabilising binary consensus in the synchronous model. In general, our approach introduces at most logarithmic overheads in terms of stabilisation time and broadcasted bits over the underlying consensus routine.
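To make the problem statement concrete, the following minimal Python sketch checks the two conditions named above on a finite trace of pulse times recorded at correct nodes after stabilisation: the k-th pulses of all nodes lie close together (synchronised), and consecutive pulses at each node are far apart (well-separated). It is only a property checker, not the algorithm of the paper. The names `max_faults`, `pulses_ok`, `skew` and `min_gap` are hypothetical and introduced here for illustration; the abstract does not fix concrete bounds.

```python
from typing import Dict, List


def max_faults(n: int) -> int:
    """Largest integer f satisfying f < n/3 (equivalently 3f < n)."""
    return (n - 1) // 3


def pulses_ok(pulses: Dict[str, List[float]], skew: float, min_gap: float) -> bool:
    """Check a finite post-stabilisation trace of pulse times of correct nodes.

    Assumed conditions (parameters are illustrative, not from the paper):
      - synchronised: the k-th pulses of all nodes lie within `skew` of each other;
      - well-separated: consecutive pulses at a node are at least `min_gap` apart.
    """
    traces = list(pulses.values())
    if not traces:
        return True
    rounds = len(traces[0])
    # Every correct node must have produced the same number of pulses.
    if any(len(trace) != rounds for trace in traces):
        return False
    # Synchronisation: compare the k-th pulse across all nodes.
    for k in range(rounds):
        times = [trace[k] for trace in traces]
        if max(times) - min(times) > skew:
            return False
    # Separation: compare consecutive pulses at each node.
    for trace in traces:
        for earlier, later in zip(trace, trace[1:]):
            if later - earlier < min_gap:
                return False
    return True


if __name__ == "__main__":
    trace = {"a": [0.0, 10.0], "b": [0.5, 10.4]}
    print(max_faults(4))
    print(pulses_ok(trace, skew=1.0, min_gap=5.0))
```

With $n = 4$ nodes the resilience bound $f < n/3$ permits a single Byzantine node, which is what `max_faults(4)` computes; the trace in the usage example passes the check for the chosen illustrative bounds.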
