System-Level Vulnerability Estimation for Data Caches

Over the past few years, radiation-induced transient errors, also referred to as soft errors, have been a severe threat to the data integrity of high-end and mainstream processors. Recent studies show that cache memories are among the most vulnerable components to soft errors within high-performance processors. Accurate modeling of the Vulnerability Factor (VF) is an essential step towards developing cost-effective protection techniques for cache memory. Although Fault Injection (FI) techniques can provide relatively accurate VF estimations they are often very time consuming. To overcome the limitation, recent analytical models were proposed to compute the cache VF in a timely fashion. In this paper, we extend previous work and propose an alternative analytical model to compute System-level Vulnerability Factor (SVF) for both write-through and write-back data caches. In our proposed analytical model, we take into account both read frequency and error masking to compute system-level vulnerability of data cache. Previously suggested modeling techniques overlook the issues of read frequency and error masking, mainly focusing on time periods in which an error could propagate in the system. In this work we show that overlooking these two parameters can significantly impact the system-level VFs for data caches. We report our estimations for SPEC’2K benchmarks and compare to previously suggested models. Our experimental results show that the proposed modeling technique changes previous VF estimations by up to 40%.

[1]  Shubu Mukherjee,et al.  Architecture Design for Soft Errors , 2008 .

[2]  Xiaodong Li,et al.  SoftArch: an architecture-level tool for modeling and analyzing soft errors , 2005, 2005 International Conference on Dependable Systems and Networks (DSN'05).

[3]  Massimo Violante,et al.  An accurate analysis of the effects of soft errors in the instruction and data caches of a pipelined microprocessor , 2003, 2003 Design, Automation and Test in Europe Conference and Exhibition.

[4]  R. Velazco,et al.  Impact of data cache memory on the single event upset-induced error rate of microprocessors , 2003 .

[5]  Richard E. Kessler,et al.  The Alpha 21264 microprocessor , 1999, IEEE Micro.

[6]  Y. Yagil,et al.  A systematic approach to SER estimation and solutions , 2003, 2003 IEEE International Reliability Physics Symposium Proceedings, 2003. 41st Annual..

[7]  Joel Emer,et al.  A systematic methodology to compute the architectural vulnerability factors for a high-performance microprocessor , 2003, Proceedings. 36th Annual IEEE/ACM International Symposium on Microarchitecture, 2003. MICRO-36..

[8]  Mehdi Baradaran Tahoori,et al.  Balancing Performance and Reliability in the Memory Hierarchy , 2005, IEEE International Symposium on Performance Analysis of Systems and Software, 2005. ISPASS 2005..

[9]  Shuai Wang,et al.  On the Characterization and Optimization of On-Chip Cache Reliability against Soft Errors , 2009, IEEE Transactions on Computers.

[10]  Sanjay J. Patel,et al.  Examining ACE analysis reliability estimates using fault-injection , 2007, ISCA '07.

[11]  Joel S. Emer,et al.  Techniques to reduce the soft error rate of a high-performance microprocessor , 2004, Proceedings. 31st Annual International Symposium on Computer Architecture, 2004..

[12]  Kishor S. Trivedi,et al.  A cache error propagation model , 1997, Proceedings Pacific Rim International Symposium on Fault-Tolerant Systems.

[13]  Wei Zhang,et al.  ICR: in-cache replication for enhancing data cache reliability , 2003, 2003 International Conference on Dependable Systems and Networks, 2003. Proceedings..

[14]  H. Asadi,et al.  Soft Error Derating Computation in Sequential Circuits , 2006, 2006 IEEE/ACM International Conference on Computer Aided Design.

[15]  Arijit Biswas,et al.  Computing Accurate AVFs using ACE Analysis on Performance Models: A Rebuttal , 2008, IEEE Computer Architecture Letters.

[16]  Arun K. Somani,et al.  Area efficient architectures for information integrity in cache memories , 1999, ISCA.

[17]  Robert Baumann,et al.  Soft errors in advanced computer systems , 2005, IEEE Design & Test of Computers.

[18]  Joel Emer,et al.  Computing Architectural Vulnerability Factors for Address-Based Structures , 2005, ISCA 2005.

[19]  Pia Sanda,et al.  Statistical Fault Injection , 2008, 2008 IEEE International Conference on Dependable Systems and Networks With FTCS and DCC (DSN).

[20]  J. Fortes,et al.  Sim-SODA : A Unified Framework for Architectural Level Software Reliability Analysis , 2006 .

[21]  R.C. Baumann,et al.  Radiation-induced soft errors in advanced semiconductor technologies , 2005, IEEE Transactions on Device and Materials Reliability.

[22]  Sarita V. Adve,et al.  Accurate microarchitecture-level fault modeling for studying hardware faults , 2009, 2009 IEEE 15th International Symposium on High Performance Computer Architecture.

[23]  Amirali Baniasadi,et al.  Using Input-to-Output Masking for System-level Vulnerability estimation in high-performance processors , 2010, 2010 15th CSI International Symposium on Computer Architecture and Digital Systems.

[24]  N. Seifert,et al.  Robust system design with built-in soft-error resilience , 2005, Computer.

[25]  Xiaodong Li,et al.  Architecture-Level Soft Error Analysis: Examining the Limits of Common Assumptions , 2007, 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN'07).

[26]  Arijit Biswas,et al.  Computing architectural vulnerability factors for address-based structures , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).

[27]  Arun K. Somani,et al.  Soft error sensitivity characterization for microprocessor dependability enhancement strategy , 2002, Proceedings International Conference on Dependable Systems and Networks.

[28]  Gwan S. Choi,et al.  On-chip cache memory resilience , 1998, Proceedings Third IEEE International High-Assurance Systems Engineering Symposium (Cat. No.98EX231).

[29]  Mehdi Baradaran Tahoori,et al.  A Field Analysis of System-level Effects of Soft Errors Occurring in Microprocessors used in Information Systems , 2008, 2008 IEEE International Test Conference.