Replica Victim Caching to Improve Reliability of In-Cache Replication

Soft-error-conscious cache design is a necessity for reliable computing. The ECC and parity-based integrity checking techniques in use today compromise either performance for reliability or reliability for performance. The recently proposed ICR (In-Cache Replication) scheme can enhance data reliability with minimal impact on performance. However, it can exploit only a limited space for replication, and thus cannot resolve the conflicts between replicas and primary data without compromising either performance or reliability. This paper proposes adding a small cache, called a replica victim cache, to resolve this dilemma effectively. Our experimental results show that a replica victim cache of 4 entries can increase the reliability of L1 data caches 21.7% more than ICR without impacting performance, with an area overhead within 10%.
