A Fault-Tolerant Processor Architecture

This paper presents a new fault-tolerant processor architecture inspired by the DIVA technique. DIVA inserts a checker unit in front of the processor's commit stage. The checker re-executes both the computation and the memory/register file reads. Whenever an error is detected, the original DIVA checker, which is assumed to be fully reliable, fixes the error, commits the results (i.e. writes them to memory or the register file), flushes the processor, and restarts it at the next instruction. In our Modified DIVA architecture, we no longer assume that the checker is fully reliable: when an error is detected, the processor is flushed and restarted at the erroneous instruction. Because recovery no longer depends on a fault-free checker, the modified architecture is more reliable. To increase performance, we protect external memory reads with ECC so that the checker unit does not re-execute them; the checker and the processor therefore no longer compete for memory accesses, as they did in the original DIVA. We have also extended the DIVA technique to a standard pipelined RISC processor (the original DIVA was mainly aimed at superscalar architectures). This paper presents these architectural improvements over the original DIVA and reports VHDL implementation results. The new technique was evaluated by fault injection in VHDL simulations.
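
The difference between the two recovery policies can be illustrated with a small behavioural model. The Python sketch below is not the paper's VHDL design: the names `execute` and `run`, the three-operation instruction set, and the fault model (a transient fault flips the lowest bit of one result) are all assumptions made for illustration, and ECC-protected memory reads are not modelled.

```python
import operator

# Hypothetical toy instruction set, for illustration only.
OPS = {"add": operator.add, "sub": operator.sub, "xor": operator.xor}


def execute(op, a, b, faulty):
    """Compute op(a, b); an injected transient fault flips the lowest result bit."""
    result = OPS[op](a, b)
    return result ^ 1 if faulty else result


def run(program, faults, policy):
    """Commit the results of `program` under one of two recovery policies.

    faults: set of (instruction index, attempt number, unit) tuples,
            with unit in {"core", "checker"}; assumed fault model.
    policy: "original" -> on mismatch, trust the checker, commit its value,
                          and continue at the next instruction.
            "modified" -> on mismatch, flush and restart at the same
                          instruction; commit only when both units agree.
    Returns (committed values, number of flushes).
    """
    committed = []
    flushes = 0
    for i, (op, a, b) in enumerate(program):
        attempt = 0
        while True:
            core = execute(op, a, b, (i, attempt, "core") in faults)
            check = execute(op, a, b, (i, attempt, "checker") in faults)
            if core == check:
                committed.append(core)
                break
            flushes += 1
            if policy == "original":
                committed.append(check)  # checker assumed fully reliable
                break
            attempt += 1  # modified policy: re-execute the same instruction
    return committed, flushes


program = [("add", 1, 2), ("sub", 10, 4), ("xor", 6, 3)]
checker_fault = {(1, 0, "checker")}  # one transient fault in the checker

original_result = run(program, checker_fault, "original")
modified_result = run(program, checker_fault, "modified")
print(original_result)
print(modified_result)
```

In this toy model a fault in the checker causes the original policy to commit a wrong value, whereas the modified policy re-executes the instruction and commits the correct one at the cost of the same single flush. A fault that corrupts both units identically in the same attempt would go undetected under either policy.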
