CPPC: Correctable parity protected cache

Due to shrinking feature sizes, processors are becoming more vulnerable to soft errors. Write-back caches are particularly vulnerable because they hold dirty data that do not exist in other levels of the memory hierarchy. Conventional error-correcting codes can protect write-back caches, but they have been shown to be expensive in terms of area and power. This paper proposes a new reliable write-back cache, the Correctable Parity Protected Cache (CPPC), which adds error correction capability to a parity-protected cache. To this end, CPPC augments a write-back parity-protected cache with two registers: the first stores the XOR of all data written to the cache, and the second stores the XOR of all dirty data removed from the cache. CPPC relies on parity to detect a fault and then on the two XOR registers to correct it. Through a novel combination of byte shifting and parity interleaving, CPPC corrects both single-bit and spatial multi-bit faults, providing a high degree of reliability. We compare CPPC with one-dimensional parity, SECDED (Single Error Correction Double Error Detection), and two-dimensional parity-protected caches. Our simulation results show that CPPC provides a high level of reliability with lower overheads than SECDED and two-dimensional parity.
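
The two-register idea can be illustrated with a minimal Python sketch. It is not the paper's implementation: the class name `CPPCSketch`, its methods, and the word-granularity dictionary model are hypothetical, and the sketch assumes that overwriting a dirty word counts as removing the old value from the cache. It covers only a single faulty dirty word detected by one parity bit per word; byte shifting and parity interleaving for spatial multi-bit faults are not modeled.

```python
def parity(word: int) -> int:
    """Even-parity bit of a word (1 if the number of set bits is odd)."""
    return bin(word).count("1") & 1


class CPPCSketch:
    """Toy word-granularity model of a parity-protected write-back cache
    with two XOR registers (hypothetical names, illustration only)."""

    def __init__(self):
        self.data = {}      # address -> stored word
        self.par = {}       # address -> parity bit stored with the word
        self.dirty = set()  # addresses holding dirty words
        self.r1 = 0         # XOR of all data written to the cache
        self.r2 = 0         # XOR of all dirty data removed from the cache

    def store(self, addr: int, word: int) -> None:
        # Assumption: an overwritten dirty word is treated as removed.
        if addr in self.dirty:
            self.r2 ^= self.data[addr]
        self.r1 ^= word
        self.data[addr] = word
        self.par[addr] = parity(word)
        self.dirty.add(addr)

    def evict(self, addr: int) -> int:
        word = self.data.pop(addr)
        self.par.pop(addr)
        if addr in self.dirty:
            self.r2 ^= word
            self.dirty.discard(addr)
        return word

    def load(self, addr: int) -> int:
        word = self.data[addr]
        if parity(word) != self.par[addr]:  # parity detects the fault
            word = self.recover(addr)
        return word

    def recover(self, addr: int) -> int:
        if addr not in self.dirty:
            raise RuntimeError("clean word: refetch from the next level")
        # r1 ^ r2 equals the XOR of all dirty words currently cached, so
        # XORing out every other dirty word leaves the lost value.
        good = self.r1 ^ self.r2
        for other in self.dirty:
            if other != addr:
                good ^= self.data[other]
        self.data[addr] = good
        return good
```

A short usage example: after a few stores, an overwrite, and an eviction, a single flipped bit in a dirty word is detected by parity on the next load and repaired from the two registers.

```python
cache = CPPCSketch()
cache.store(0, 0xDEADBEEF)
cache.store(1, 0x12345678)
cache.store(2, 0x0F0F0F0F)
cache.store(1, 0xCAFEBABE)   # overwrite a dirty word
cache.evict(2)               # write back a dirty word
cache.data[0] ^= 1 << 7      # inject a single-bit soft error
assert cache.load(0) == 0xDEADBEEF
```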
