Efficient Construction of Global Time in SoCs Despite Arbitrary Faults

In this paper, we show how to build synchronized clocks of arbitrary size on top of existing small-sized clocks, despite arbitrary faults. Our solution is both self-stabilizing and Byzantine fault-tolerant, and requires only single-bit channels. It relies on a reduction to Byzantine fault-tolerant consensus, so that different consensus algorithms can be plugged in to best match the actual clock sizes and resilience requirements. We demonstrate the practicability of our approach by means of an FPGA implementation and its experimental evaluation. To also cover the cases where deterministic algorithms hit fundamental limits, we provide a novel randomized self-stabilizing Byzantine consensus algorithm that remains effective in these settings, along with its correctness proof and an analysis of its stabilization time.
