Dynamic reconfiguration of embedded-DRAM caches employing zero data detection based refresh optimisation

Abstract: Multi-level cache hierarchies with large last-level caches (LLCs) have emerged to narrow the performance gap between the processing cores and main memory. Traditionally, LLCs are built with SRAM technology; however, recent trends show that dense, low-leakage embedded DRAM (eDRAM) is a good alternative to SRAM. The only challenge for this replacement is the time and energy consumed by the periodic refresh that eDRAM requires. To minimise the number of refreshes and in turn save energy, this paper proposes a value-based refresh-saving method. In particular, some data blocks contain only zeros, and such blocks do not need refreshing. These zero-valued blocks are kept in a dedicated partition of the cache, and this partition is never refreshed. The size of the zero-value partition is dynamically reconfigured according to the variation in either the number of zero blocks or the miss rate during the execution of the application. These two dynamic reconfiguration policies reduce the number of refreshes by 38% and 43%, respectively. The consequent reduction in stall cycles improves performance by 6 to 9% and also yields considerable energy savings.
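The idea of a never-refreshed zero partition can be illustrated with a small sketch. This is not the paper's implementation: the function names, the block-count model of the partition, and the grow/shrink rule with its `step` parameter are all assumptions made for illustration, since the abstract does not specify how the partition is resized.

```python
from typing import List


def is_zero_block(data: bytes) -> bool:
    """Return True when every byte of the cache block is zero."""
    return not any(data)


def refreshes_needed(blocks: List[bytes], zero_capacity: int) -> int:
    """Count blocks that must be refreshed in one refresh interval.

    Assumption: up to `zero_capacity` zero-valued blocks fit in the
    zero partition and are skipped; all other blocks are refreshed.
    """
    zero_blocks = sum(1 for b in blocks if is_zero_block(b))
    skipped = min(zero_blocks, zero_capacity)
    return len(blocks) - skipped


def resize_zero_partition(current: int, zero_blocks: int,
                          min_size: int, max_size: int,
                          step: int = 1) -> int:
    """Hypothetical resizing rule driven by the zero-block count.

    Grow when zero blocks exceed the partition, shrink when the
    partition is under-used by more than one step, else keep the size.
    """
    if zero_blocks > current:
        return min(current + step, max_size)
    if zero_blocks < current - step:
        return max(current - step, min_size)
    return current


if __name__ == "__main__":
    # Four 64-byte blocks, three of which are all zeros.
    blocks = [bytes(64), bytes(64), b"\x01" + bytes(63), bytes(64)]
    print(refreshes_needed(blocks, zero_capacity=2))
    print(resize_zero_partition(2, zero_blocks=3, min_size=0, max_size=4))
```

In this toy model, a larger zero partition removes more blocks from the refresh count, but only while enough zero blocks exist to fill it, which is why the partition size is worth adapting at run time. A miss-rate-driven variant would replace the `zero_blocks` comparison with a miss-rate threshold, which is likewise an assumption here.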
