Self-Stabilizing Byzantine Clock Synchronization with Optimal Precision

Abstract

In the Byzantine-tolerant clock synchronization problem, the goal is to synchronize the clocks of n fully connected nodes. The clocks run at rates between 1 and 𝜗 > 1, and messages have a delay (including computation) between d − U and d. Moreover, up to f < n/3 of the nodes can fail by deviating arbitrarily from the protocol, i.e., they are Byzantine. Despite this interference, correct nodes need to generate distinguished events (or pulses) almost simultaneously and periodically. The quality of the solution is measured by the skew, which is the maximum real-time difference between corresponding pulses. In the self-stabilizing setting, we additionally allow for transient failures, possibly of all nodes. Once transient faults have ceased and at most f nodes remain faulty, the system should start generating synchronized pulses again. We design a self-stabilizing solution to this problem with asymptotically optimal skew. We achieve this by refining and extending the protocol of Lynch and Welch, making the following contributions in the process:

- We give a simple analysis of the Lynch and Welch protocol with improved bounds on the skew and on the tolerable difference in clock rates, building upon the main ingredient of their protocol, called approximate agreement.
- We give a modified version of the protocol that reduces the frequency and amount of communication between the nodes. The modification adds a step that adjusts the clock rates by another application of approximate agreement. The skew bound achieved is asymptotically optimal for suitable choices of parameters.
- We present a method to add self-stabilization to the above protocols while preserving their skew bounds. The heart of the method is a coupling scheme that leverages a self-stabilizing protocol with a larger skew.
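The abstract names approximate agreement as the main ingredient but does not spell out a step of it. As a rough illustration only, the sketch below shows the textbook fault-tolerant midpoint rule commonly associated with approximate agreement: sort the received values, discard the f lowest and f highest, and take the midpoint of what remains. The function names `fault_tolerant_midpoint` and `skew` are hypothetical, and the sketch ignores clock drift (𝜗) and delay uncertainty (U), so it is not the protocol analyzed in the paper.

```python
from typing import Sequence


def fault_tolerant_midpoint(values: Sequence[float], f: int) -> float:
    """One approximate-agreement step (illustrative sketch, not the paper's protocol).

    Sorts the received values, drops the f lowest and f highest, and returns
    the midpoint of the remaining range. Requires n > 3f, matching f < n/3.
    """
    n = len(values)
    if f < 0 or n <= 3 * f:
        raise ValueError("requires f >= 0 and n > 3f")
    ordered = sorted(values)
    trimmed = ordered[f:n - f]  # at most f of the discarded values are correct
    return (trimmed[0] + trimmed[-1]) / 2.0


def skew(pulse_times: Sequence[float]) -> float:
    """Maximum difference between corresponding pulse times of correct nodes."""
    return max(pulse_times) - min(pulse_times)


# Assumed toy scenario: n = 4, f = 1. Three correct nodes hold 0.0, 2.0, 4.0.
# The Byzantine node reports -100.0 to node A and +100.0 to node B.
honest = [0.0, 2.0, 4.0]
seen_by_a = honest + [-100.0]
seen_by_b = honest + [100.0]

new_a = fault_tolerant_midpoint(seen_by_a, f=1)
new_b = fault_tolerant_midpoint(seen_by_b, f=1)
```

In this toy run both results stay inside the range of the correct values, and the gap between the two nodes is half the original spread, even though the faulty node sent conflicting extreme values.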
