Towards refresh-optimized EDRAM-based caches with a selective fine-grain round-robin refresh scheme

Recently, EDRAM cells have gained much attention as a promising alternative to construct on-chip memories. However, due to inherent characteristics of DRAM cells, they need to be refreshed periodically, causing a huge refresh energy burden. Particularly, employing EDRAM cells in large-scale last-level caches will make refresh burden much higher due to their large capacity. In this paper, we propose a selective fine-grain round-robin refresh scheme for both performance improvement and refresh energy reduction. To reduce bank conflicts between normal cache accesses and refresh operations, we employ a refresh scheme which refreshes cache lines in a bank-wise round-robin fashion. We also apply a selective refresh depending on the inclusive information in cache hierarchies. For the data which reside in both LLC and upper-level cache (i.e., L2 cache), the data access will be filtered by the upper-level cache. Based on this insight, we skip the refresh to the cache block in the EDRAM-based LLC which also exists in the upper-level caches. By doing so, we can reduce unnecessary refresh operations in EDRAM-based LLCs. According to our evaluation, our proposed scheme improves performance by 7.3% and reduces energy per instruction by 13.3% compared to the baseline all-bank refresh scheme.

[1]  José F. Martínez,et al.  Understanding and mitigating refresh overheads in high-density DDR4 DRAM systems , 2013, ISCA.

[2]  Samira Manabi Khan,et al.  Sampling Dead Block Prediction for Last-Level Caches , 2010, 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture.

[3]  Hsien-Hsin S. Lee,et al.  Smart Refresh: An Enhanced Memory Controller Design for Reducing Energy in Conventional and 3D Die-Stacked DRAMs , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).

[4]  Sung Kyu Lim,et al.  Exploration of temperature-aware refresh schemes for 3D stacked eDRAM caches , 2016, Microprocess. Microsystems.

[5]  Bruce Jacob,et al.  DRAM Refresh Mechanisms, Penalties, and Trade-Offs , 2016, IEEE Transactions on Computers.

[6]  Song Liu,et al.  Flikker: saving DRAM refresh-power through critical data partitioning , 2011, ASPLOS XVI.

[7]  Richard E. Matick,et al.  Logic-based eDRAM: Origins and rationale for use , 2005, IBM J. Res. Dev..

[8]  Hyesoon Kim,et al.  FLEXclusion: Balancing cache capacity and on-chip bandwidth via Flexible Exclusion , 2012, 2012 39th Annual International Symposium on Computer Architecture (ISCA).

[9]  Amin Ansari,et al.  Mosaic: Exploiting the spatial locality of process variation to reduce refresh energy in on-chip eDRAM modules , 2014, 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA).

[10]  Richard Veras,et al.  RAIDR: Retention-aware intelligent DRAM refresh , 2012, 2012 39th Annual International Symposium on Computer Architecture (ISCA).

[11]  Adel Javanmard,et al.  Versatile refresh: low complexity refresh scheduling for high-throughput multi-banked eDRAM , 2012, SIGMETRICS '12.

[12]  Moinuddin K. Qureshi,et al.  Refresh pausing in DRAM memory systems , 2014, TACO.

[13]  Bruce Jacob,et al.  Coordinated refresh: Energy efficient techniques for DRAM refresh scheduling , 2013, International Symposium on Low Power Electronics and Design (ISLPED).

[14]  Pierre Michaud,et al.  Demystifying multicore throughput metrics , 2013, IEEE Computer Architecture Letters.

[15]  Wei Wu,et al.  Reducing cache power with low-cost, multi-bit error-correcting codes , 2010, ISCA.

[16]  Onur Mutlu,et al.  Improving DRAM performance by parallelizing refreshes with accesses , 2014, 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA).

[17]  Lizy Kurian John,et al.  Elastic Refresh: Techniques to Mitigate Refresh Penalties in High Density Memory , 2010, 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture.

[18]  Bruce Jacob,et al.  Technology comparison for large last-level caches (L3Cs): Low-leakage SRAM, low write-energy STT-RAM, and refresh-optimized eDRAM , 2013, 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA).

[19]  Sung Woo Chung,et al.  Exploiting Refresh Effect of DRAM Read Operations: A Practical Approach to Low-Power Refresh , 2016, IEEE Transactions on Computers.

[20]  Amin Ansari,et al.  Refrint: Intelligent refresh to minimize power in on-chip multiprocessor cache hierarchies , 2013, 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA).

[21]  José Duato,et al.  A reuse-based refresh policy for energy-aware eDRAM caches , 2015, Microprocess. Microsystems.