TRIX: Low-Skew Pulse Propagation for Fault-Tolerant Hardware

The vast majority of hardware architectures use a carefully timed reference signal to clock their computational logic. However, standard distribution solutions are not fault-tolerant. In this work, we present a simple grid structure as a more reliable clock propagation method and study it by means of simulation experiments. Fault tolerance is achieved by forwarding a clock pulse on arrival of the second of three incoming signals from the previous layer. A key question is how well neighboring grid nodes are synchronized, even without faults. Analyzing the clock skew under typical-case conditions is highly challenging. Because the forwarding mechanism amounts to taking the median of three arrival times, standard probabilistic tools fail, even when link delays are modeled simply as unbiased coin flips. Our statistical approach provides substantial evidence that this system performs surprisingly well. Specifically, in an "infinitely wide" grid of height $H$, the delay at a pre-selected node exhibits a standard deviation of $O(H^{1/4})$ ($\approx 2.7$ link delay uncertainties for $H=2000$), and the skew between adjacent nodes is $o(\log \log H)$ ($\approx 0.77$ link delay uncertainties for $H=2000$). We conclude that the proposed system is a very promising clock distribution method. This leads to the open problem of a stochastic explanation of the tight concentration of delays and skews. More generally, we believe that understanding our very simple abstraction of the system is of mathematical interest in its own right.
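The forwarding rule described above can be illustrated with a minimal simulation sketch. This is not the authors' simulator. It assumes that each node listens to the three nearest nodes in the previous layer (columns $i-1$, $i$, $i+1$), that each link delay is an independent unbiased coin flip in $\{0, 1\}$, and that a finite grid with wrap-around columns stands in for the "infinitely wide" grid. The names `simulate_trix` and `adjacent_skews` are hypothetical.

```python
import random
import statistics


def simulate_trix(height, width, seed=0):
    """Return pulse times of the last layer of a height-by-width grid.

    Assumptions (not taken from the paper's code): layer 0 fires at time 0,
    columns wrap around, and each link adds a delay of 0 or 1 chosen by a
    fair coin flip.
    """
    rng = random.Random(seed)
    times = [0] * width  # layer 0: all nodes pulse simultaneously
    for _ in range(height):
        next_times = []
        for i in range(width):
            # Arrival times of the three incoming signals from the previous layer.
            arrivals = [
                times[(i + offset) % width] + rng.randint(0, 1)
                for offset in (-1, 0, 1)
            ]
            # Forward on the second arrival, i.e. the median of the three.
            next_times.append(sorted(arrivals)[1])
        times = next_times
    return times


def adjacent_skews(times):
    """Absolute pulse-time differences between neighboring columns."""
    n = len(times)
    return [abs(times[i] - times[(i + 1) % n]) for i in range(n)]


if __name__ == "__main__":
    final = simulate_trix(height=200, width=64, seed=1)
    skews = adjacent_skews(final)
    print("mean pulse time:", statistics.mean(final))
    print("mean adjacent skew:", statistics.mean(skews))
    print("max adjacent skew:", max(skews))
```

Waiting for the second of three signals means that a single early or missing input cannot by itself trigger or block a pulse, which is the source of the fault tolerance. A single small run like this one only illustrates the mechanism and says nothing about the asymptotic behavior quoted above.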
