On the Threat of Metastability in an Asynchronous Fault-Tolerant Clock Generation Scheme

Due to their handshake-based flow control, asynchronous circuits generally do not suffer from metastability issues as much as synchronous circuits do. We will show, however, that fault effects like single-event transients can force (sequential) asynchronous building blocks such as Muller C-Elements into a metastable state. Using the example of a fault-tolerant clock generation scheme, we will illustrate that metastability could overcome conventional error containment boundaries, and that, ultimately, a single metastable upset could cause even a system that tolerates multiple Byzantine faults to fail. In order to quantify this threat, we performed analytic modeling and simulation of the elastic pipelines, which are at the heart of our physical implementation of the fault-tolerant clocks. Our analysis reveals that only transient pulses of some very specific width can trigger metastable behavior. So, even without considering other masking effects, the probability that a metastable upset propagates through a pipeline is fairly small. Nevertheless, a thorough metastability analysis is mandatory for circuits employed in high-dependability applications.
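
To make concrete why a Muller C-Element is a sequential (state-holding) building block, the following is a minimal behavioral sketch in Python. It is an idealized, zero-delay model that is not taken from the paper: the names `c_element`, `run`, and `trace` are hypothetical, and the model cannot exhibit metastability itself, since that is an analog effect depending on pulse width. It only shows how a short transient on one input, arriving while the other input is already high, flips the stored output and leaves it flipped after the transient has vanished.

```python
def c_element(a: int, b: int, prev: int) -> int:
    """Ideal Muller C-Element: follow the inputs when they agree, else hold."""
    return a if a == b else prev


def run(trace, initial=0):
    """Apply a sequence of (a, b) input pairs and record the output after each."""
    outputs, state = [], initial
    for a, b in trace:
        state = c_element(a, b, state)
        outputs.append(state)
    return outputs


# Assumed scenario: input a is already high, input b sees a one-step transient.
trace = [(0, 0), (1, 0), (1, 1), (1, 0), (0, 0)]
print(run(trace))  # [0, 0, 1, 1, 0]
```

In the fourth step the transient on `b` is gone, yet the output stays at 1 until both inputs return to 0. A real element hit by a pulse of marginal width may instead settle late or at an intermediate level, which is the metastable case the abstract refers to.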
