Trading off transient fault tolerance and power consumption in deep submicron (DSM) VLSI circuits

High tolerance to transient faults and low power consumption are key objectives in the design of critical embedded systems. Systems such as smart cards, PDAs, wearable computers, pacemakers, and defibrillators must be designed not only for fault tolerance but also for ultra-low power consumption because of limited battery life. In this paper, a highly accurate method of estimating fault tolerance in terms of mean time to failure (MTTF) is presented. The estimation is based on circuit-level simulations (HSPICE) and uses a double-exponential current-source fault model. Using counter circuits, it is shown that transient fault tolerance and power dissipation in low-power circuits are at odds, which allows a tradeoff between power and fault tolerance. Architecture-level and circuit-level fault-tolerance and low-power techniques are used to demonstrate and quantify this tradeoff. Estimates show that incorporating these techniques results in either a design with an MTTF of 36 years and a power consumption of 102 μW, or a design with an MTTF of 12 years and a power consumption of 20 μW. Depending on the criticality of the system and the power budget, certain techniques may be preferred over others, yielding either a more fault-tolerant design or a lower-power design at the expense of the other objective.
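As an illustration of the fault model named in the abstract, the following is a minimal Python sketch of the commonly used double-exponential current pulse, I(t) = I0 (exp(-t/tau_alpha) - exp(-t/tau_beta)). The function names, parameter names, and numeric values are assumptions chosen for illustration and are not taken from the paper; the paper's own parameters and its HSPICE setup are not given in the abstract.

```python
import math


def double_exp_current(t, i0, tau_alpha, tau_beta):
    """Injected current (A) at time t (s) after a particle strike at t = 0.

    i0        : current scale factor (A), hypothetical parameter name
    tau_alpha : slower (decay) time constant (s), must exceed tau_beta
    tau_beta  : faster (rise) time constant (s)
    """
    if t < 0:
        return 0.0  # no current before the strike
    return i0 * (math.exp(-t / tau_alpha) - math.exp(-t / tau_beta))


def collected_charge(i0, tau_alpha, tau_beta):
    """Total injected charge (C): the integral of the pulse from 0 to infinity."""
    return i0 * (tau_alpha - tau_beta)


def peak_time(tau_alpha, tau_beta):
    """Time (s) at which the pulse peaks, found by setting dI/dt = 0."""
    return (tau_alpha * tau_beta / (tau_alpha - tau_beta)) * math.log(tau_alpha / tau_beta)


# Illustrative values only (assumed, not from the paper).
I0 = 1e-3           # 1 mA scale factor
TAU_ALPHA = 200e-12  # 200 ps decay constant
TAU_BETA = 50e-12    # 50 ps rise constant

if __name__ == "__main__":
    tp = peak_time(TAU_ALPHA, TAU_BETA)
    print("peak time (s):", tp)
    print("peak current (A):", double_exp_current(tp, I0, TAU_ALPHA, TAU_BETA))
    print("collected charge (C):", collected_charge(I0, TAU_ALPHA, TAU_BETA))
```

In a circuit-level simulation, a pulse of this shape would typically be attached as a current source to a sensitive node, and the injected charge swept to find the smallest value that flips the stored state. That usage is a general description of the technique and not a statement of the authors' exact procedure.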
