TS Cache: A Fast Cache With Timing-Speculation Mechanism Under Low Supply Voltages

To mitigate the ever-worsening "power wall" problem, a growing number of applications need to extend their operating voltage over a wide range that includes the near-threshold region. However, because of process variation, the read delay distribution of static random access memory (SRAM) cells has a much heavier long tail at near-threshold voltage than at nominal voltage. This degradation in SRAM delay makes the SRAM-based cache a performance bottleneck for the whole system. To avoid unreliable reads, circuit-level studies use larger or additional transistors in each bitcell, at the cost of chip area and static power in the cache arrays. Architectural studies propose fault-tolerant caches with auxiliary error correction or block disabling/remapping, but their complex access logic worsens both hit latency and energy efficiency. This article proposes a timing-speculation (TS) cache that boosts cache frequency and improves energy efficiency under low supply voltages. In the TS cache, a sense amplifier (SA) evaluates the voltage difference of the bitlines (BLs) twice in succession, so an access timing error can be detected much earlier than in prior methods. According to measurement results from the fabricated chips, the TS L1 cache raises its frequency to $1.62\times$ and $1.92\times$ that of the conventional scheme at 0.5- and 0.6-V supply voltages, respectively.
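The double-evaluation idea can be illustrated with a small behavioral sketch in Python. This is a toy model under stated assumptions, not the circuit described in the article: the names `sense`, `speculative_read`, `t_early`, `t_late`, `stale_value`, and `replay_penalty` are hypothetical, a cell is reduced to a single read delay, and the second evaluation is assumed to be late enough to be correct for every cell. A mismatch between the two evaluations is treated as a detected timing error that costs extra cycles.

```python
from dataclasses import dataclass


@dataclass
class ReadResult:
    value: int          # data returned to the pipeline
    timing_error: bool  # True if the early evaluation was wrong
    cycles: int         # access latency including any replay


def sense(cell_value: int, cell_delay: float, sample_time: float,
          stale_value: int) -> int:
    """Toy sense amplifier: the stored bit is seen only if the bitline
    differential has developed by sample_time; otherwise a stale value is seen."""
    return cell_value if sample_time >= cell_delay else stale_value


def speculative_read(cell_value: int, cell_delay: float,
                     t_early: float, t_late: float,
                     stale_value: int = 0,
                     replay_penalty: int = 1) -> ReadResult:
    """Evaluate the bitlines twice and flag an error if the results differ.

    Assumption (hypothetical model): t_late exceeds the delay of every cell,
    so the second evaluation is always trusted.
    """
    first = sense(cell_value, cell_delay, t_early, stale_value)
    second = sense(cell_value, cell_delay, t_late, stale_value)
    error = first != second
    cycles = 1 + (replay_penalty if error else 0)
    return ReadResult(value=second, timing_error=error, cycles=cycles)


# A fast cell finishes before the early evaluation; a slow (long-tail) cell does not.
fast = speculative_read(cell_value=1, cell_delay=0.5, t_early=1.0, t_late=2.0)
slow = speculative_read(cell_value=1, cell_delay=1.5, t_early=1.0, t_late=2.0)
```

In this model the clock period is set by `t_early`, so the common case of fast cells runs at the speculative frequency, and only the rare slow cells pay the replay penalty. A slow cell whose stored bit happens to equal the stale value raises no error, which is harmless because the early result was already correct.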
