Life after failure

This paper discusses the challenges, strategies, and mitigation methods for prolonging the operation of a product that is about to fail or has already experienced failures. The focus is mostly on two component types: ASICs and memories. Aging and failure mitigation techniques vary by component type. Design for reliability (DFR) techniques have so far focused on measuring the impact of failures and preventing them as much as possible. Current DFR methodologies are reaching their maximum level of effectiveness, yet ASIC and memory failures still occur and affect the service operation of many products in the industry. Today's technologies and their challenges require a different approach to addressing failures: real-time self-healing capability. The methodologies developed in the area of self-healing show promise for overall product fault management, where failures due to aging ASICs and failing memories can be addressed while the system is in operation. This capability allows equipment providers to maintain their customers' confidence in their ability to design, develop, and deliver robust products, and allows continuous system operation in the presence of failures. In this paper, we discuss the fundamentals and phenomena of aging, modeling and simulation, correlation between simulation and actual devices, sensor designs, mitigation strategies, the impact on system reliability and availability, and reliability prediction.
