Quantized AVF: A Means of Capturing Vulnerability Variations over Small Windows of Time

Architectural vulnerability factor (AVF) is the probability that a transient fault in a bit, gate, or transistor becomes a user-visible error. AVFs vary widely across time, applications, and bits. The overall soft error rate of a processor is usually computed from AVFs averaged over time and across applications. Average AVFs, however, cannot express the short-term vulnerability variations of a bit, because they tend to settle to a fixed value over time. To quantify the vulnerability of bits over short durations, we introduce the concept of Quantized AVF (Q-AVF). Q-AVF expresses the vulnerability of a bit to soft errors over a short interval of time, called a quantum. The average AVF of a bit for a specific interval can be computed as a weighted average of the Q-AVFs of all the quanta in that interval. Our analysis of Q-AVF shows significant run-time variation, 80% or more for certain applications. By capturing vulnerability variations over short windows of time, Q-AVFs provide better opportunities for reducing the performance and power overhead of reliability solutions at run-time. To compute Q-AVF in hardware, we use linear regression analysis to derive highly accurate equations that can be implemented with eight simple parameters. These parameters can accurately track the Q-AVFs of various structures throughout the processor pipeline. Keeping the number of parameters small is critical to reducing the complexity of run-time Q-AVF tracking, and thereby to making Q-AVF estimation in hardware practical.
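
The relationship between Q-AVFs and the average AVF of an interval can be illustrated with a minimal sketch. The abstract does not say what the weights are; the sketch below assumes each quantum is weighted by its length in cycles. The function name `interval_avf` and the sample values are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def interval_avf(q_avfs: Sequence[float], quantum_cycles: Sequence[int]) -> float:
    """Average AVF of an interval as a weighted average of per-quantum Q-AVFs.

    Assumption (not stated in the abstract): the weight of each quantum
    is its length in cycles.
    """
    if len(q_avfs) != len(quantum_cycles):
        raise ValueError("need exactly one cycle count per Q-AVF value")
    total_cycles = sum(quantum_cycles)
    if total_cycles <= 0:
        raise ValueError("interval must span at least one cycle")
    weighted_sum = sum(q * c for q, c in zip(q_avfs, quantum_cycles))
    return weighted_sum / total_cycles


# Hypothetical Q-AVF samples for one bit over three consecutive quanta.
q_avfs = [0.10, 0.90, 0.20]
cycles = [1000, 1000, 2000]

average = interval_avf(q_avfs, cycles)   # single averaged value for the interval
spread = max(q_avfs) - min(q_avfs)       # short-term variation the average hides
print(f"interval AVF = {average:.2f}, Q-AVF spread = {spread:.2f}")
```

In this made-up example the averaged AVF is a single moderate value even though the per-quantum vulnerability swings widely, which is the kind of short-term variation that Q-AVF is meant to expose.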
