Reliable Ultra-Low-Voltage Cache Design for Many-Core Systems

We reduce cache supply voltage below the normally acceptable VDDMIN, in order to improve overall many-core system energy efficiency. Based on the observation that cache lines contain mostly one hard faulty cell at these ultra-low supply voltages, we exploit existing double-error correcting triple-error detecting codes, together with cache line disabling, to handle both soft and hard cache faults, thus enabling reliable ultra-low supply voltage cache operation. Compared to the next-best approach in the research literature, the proposed method reduces system energy consumption by up to 25% and energy-execution time product by nearly 10%, while introducing only 0.28% storage overhead and marginal instruction per cycle degradation, when the target yield loss rate is 1/1000.

[1]  S. E. Schuster Multiple word/bit line redundancy for semiconductor memories , 1978 .

[2]  Anoop Gupta,et al.  The SPLASH-2 programs: characterization and methodological considerations , 1995, ISCA.

[3]  M. Horiguchi,et al.  Redundancy techniques for high-density DRAMs , 1997, 1997 Proceedings Second Annual IEEE International Conference on Innovative Systems in Silicon.

[4]  Kaushik Roy,et al.  A process-tolerant cache architecture for improved yield in nanoscale technologies , 2005, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.

[5]  Ram Huggahalli,et al.  Impact of Cache Coherence Protocols on the Processing of Network Traffic , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).

[6]  Babak Falsafi,et al.  Multi-bit Error Tolerant Caches Using Two-Dimensional Error Coding , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).

[7]  K. Reick,et al.  Fault-Tolerant Design of the IBM Power6 Microprocessor , 2007, IEEE Micro.

[8]  J. Draper,et al.  Parallel double error correcting code design to mitigate multi-bit upsets in SRAMs , 2008, ESSCIRC 2008 - 34th European Solid-State Circuits Conference.

[9]  A. Kumar,et al.  Implementation of an 8-Core, 64-Thread, Power-Efficient SPARC Server on a Chip , 2008, IEEE Journal of Solid-State Circuits.

[10]  Jung Ho Ahn,et al.  A Comprehensive Memory Modeling Tool and Its Application to the Design and Analysis of Future Memory Hierarchies , 2008, 2008 International Symposium on Computer Architecture.

[11]  Alaa R. Alameldeen,et al.  Trading off Cache Capacity for Reliability to Enable Low Voltage Operation , 2008, 2008 International Symposium on Computer Architecture.

[12]  Wei Wu,et al.  Improving cache lifetime reliability at ultra-low voltages , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[13]  Borivoje Nikolic,et al.  Large-Scale SRAM Variability Characterization in 45 nm CMOS , 2009, IEEE Journal of Solid-State Circuits.

[14]  Jaume Abella,et al.  Low Vccmin fault-tolerant cache with highly predictable performance , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[15]  Nam Sung Kim,et al.  Minimizing total area of low-voltage SRAM arrays through joint optimization of cell size, redundancy, and ECC , 2010, 2010 IEEE International Conference on Computer Design.

[16]  George Kurian,et al.  Graphite: A distributed parallel simulator for multicores , 2010, HPCA - 16 2010 The Sixteenth International Symposium on High-Performance Computer Architecture.

[17]  Wei Wu,et al.  Energy-efficient cache design using variable-strength error-correcting codes , 2011, 2011 38th Annual International Symposium on Computer Architecture (ISCA).

[18]  Nam Sung Kim,et al.  Low-voltage on-chip cache architecture using heterogeneous cell sizes for high-performance processors , 2011, 2011 IEEE 17th International Symposium on High Performance Computer Architecture.

[19]  Chen Sun,et al.  Cross-layer Energy and Performance Evaluation of a Nanophotonic Manycore Processor System Using Real Application Workloads , 2012, 2012 IEEE 26th International Parallel and Distributed Processing Symposium.