Runtime adaptable concurrent error detection for linear digital systems

In response to the rising fault susceptibility of ICs caused by aggressive device scaling, a number of concurrent error detection (CED) techniques have been proposed. Existing circuit- and logic-level CED techniques target the worst-case fault susceptibility. Because the energy consumption of circuitry varies significantly with its CED capability, designing for the worst case can incur significant overhead in today's deep sub-micron devices, whose fault susceptibility varies strongly. In this paper, we propose a novel RT-level CED technique for linear digital systems. The proposed technique offers runtime-adaptable CED, so that a device spends no more energy on CED than its CED needs require.
