Globally Optimized Robust Systems to Overcome Scaled CMOS Reliability Challenges

Future system design methodologies must accept that the underlying hardware will be imperfect, and must enable the design of robust systems that are resilient to hardware imperfections. Three techniques that can enable a sea change in robust system design are: (1) built-in soft error resilience (BISER), (2) circuit failure prediction, and (3) concurrent autonomous self-test using stored patterns (CASP). Global optimization across multiple abstraction layers is essential for cost-effective robust system design using these techniques.
