An effective hybrid fault-tolerant architecture for pipelined cores

The increasing vulnerability of transistors and interconnects caused by CMOS technology scaling continues to challenge the reliability of future electronic circuits and systems. Lifetime reliability is gaining attention over performance as a design factor, even for lower-end commodity applications. In this paper we propose an effective hybrid fault-tolerant architecture that handles both permanent and transient faults in the combinational parts of pipelined cores. The principle is to triplicate the combinational logic but, unlike triple modular redundancy (TMR), to run only two copies in parallel while the third remains in standby until an error is detected. We have implemented this approach on a MIPS microprocessor as a case study. Experiments show that our approach is comparable to TMR in terms of area, yields a notable power saving, and offers full protection against transient and permanent faults.
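The two-active, one-standby principle can be illustrated with a small behavioral model. This is a minimal sketch under stated assumptions, not the architecture evaluated in the paper: the class name `HybridStage`, the rule that the standby copy breaks the tie on a mismatch, and the swap of the disagreeing copy into standby are all hypothetical choices made for illustration.

```python
from typing import Callable, List

# A combinational block is modeled as a pure function from input to output.
Logic = Callable[[int], int]


class HybridStage:
    """Hypothetical model: three logic copies, two active, one in standby."""

    def __init__(self, copies: List[Logic]) -> None:
        if len(copies) != 3:
            raise ValueError("exactly three copies are required")
        self.copies = list(copies)
        self.active = [0, 1]          # indices of the two running copies
        self.standby = 2              # index of the idle copy
        self.standby_activations = 0  # how often the idle copy was woken up

    def evaluate(self, x: int) -> int:
        a, b = self.active
        ra, rb = self.copies[a](x), self.copies[b](x)
        if ra == rb:
            # Normal case: the two active copies agree, standby stays idle.
            return ra

        # Mismatch detected: wake the standby copy to break the tie
        # (assumed recovery policy, not taken from the paper).
        self.standby_activations += 1
        rs = self.copies[self.standby](x)
        if rs == ra:
            # Copy b disagreed; move it to standby.
            self.active, self.standby = [a, self.standby], b
            return rs
        if rs == rb:
            # Copy a disagreed; move it to standby.
            self.active, self.standby = [b, self.standby], a
            return rs
        raise RuntimeError("no two copies agree; fault cannot be masked")


def good(x: int) -> int:
    return x + 1   # fault-free behavior of the modeled logic


def stuck(x: int) -> int:
    return 0       # models a permanent fault forcing the output to 0


stage = HybridStage([good, stuck, good])
first = stage.evaluate(5)   # mismatch, standby copy is activated
second = stage.evaluate(7)  # faulty copy is now idle, no mismatch
```

In this model the standby copy is evaluated only when the comparison fails, which is the intuition behind the power saving relative to TMR, where all three copies switch on every cycle. A transient fault causes a single swap with no lasting harm, while a permanently faulty copy ends up parked in standby.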
