Versatile refresh: low complexity refresh scheduling for high-throughput multi-banked eDRAM

Multi-banked embedded DRAM (eDRAM) has become increasingly popular in high-performance systems. However, the data retention problem of eDRAM is exacerbated by the larger number of banks and the high-performance environment in which it is deployed: the data retention time of each memory cell decreases while the number of cells to be refreshed increases. To address this, multi-bank designs offer a concurrent refresh mode, in which idle banks can be refreshed concurrently with read and write operations. However, conventional techniques such as periodically scheduled refreshes (with priority given to refreshes when they conflict with reads or writes) have variable performance, increase read latency, and can perform poorly under worst-case memory access patterns. We propose Versatile Refresh (VR), a novel refresh scheduling algorithm that is low-complexity, produces near-optimal throughput with universal guarantees, and is tolerant to bursty memory access patterns. The central idea is to decouple the scheduler into two simple-to-implement modules: one determines which cell to refresh next, and the other determines when to force an idle cycle in all banks. We derive necessary and sufficient conditions to guarantee data integrity for all access patterns, for any given number of banks, rows per bank, read/write ports, and data retention time. Our analysis shows that there is a tradeoff between refresh overhead and burst tolerance, and characterizes this tradeoff precisely. The algorithm is shown to be near-optimal and achieves, for instance, a 76.6% reduction in worst-case refresh overhead relative to the periodic refresh algorithm for a 250 MHz eDRAM with a 10 µs retention time and 16 banks of 128 rows each. Simulations with Apex-Map synthetic benchmarks and switch lookup table traffic show that VR can almost completely hide the refresh overhead for memory accesses with moderate-to-high multiplexing across memory banks.
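To give a sense of scale for the example configuration quoted above, the following is a minimal back-of-the-envelope sketch in Python. It only converts the stated parameters (250 MHz, 10 µs retention, 16 banks, 128 rows per bank) into a refresh budget. The function names and the two bounding scenarios are illustrative assumptions, not the paper's overhead model, and the resulting fractions should not be read as the paper's reported results.

```python
from fractions import Fraction


def retention_window_cycles(clock_mhz: int, retention_us: int) -> int:
    """Clock cycles in one retention window (MHz * microseconds = cycles)."""
    return clock_mhz * retention_us


def refresh_budget(clock_mhz: int, retention_us: int,
                   banks: int, rows_per_bank: int) -> dict:
    """Hypothetical helper: bound the refresh work per retention window.

    Assumption (ours, not the paper's): refreshing one row occupies its
    bank for exactly one cycle.
    """
    window = retention_window_cycles(clock_mhz, retention_us)
    total_rows = banks * rows_per_bank
    return {
        "window_cycles": window,
        "total_rows": total_rows,
        # Pessimistic bound: every row refresh blocks the whole memory.
        "serial_fraction": Fraction(total_rows, window),
        # Optimistic bound: every forced idle cycle refreshes one row
        # in each bank at the same time.
        "all_banks_fraction": Fraction(rows_per_bank, window),
    }


budget = refresh_budget(clock_mhz=250, retention_us=10,
                        banks=16, rows_per_bank=128)
for name, value in budget.items():
    print(name, value)
```

Under these assumptions, each of the 2048 rows must be refreshed once every 2500 cycles. The gap between the two bounds is the room a scheduler has to work with when it hides refreshes in idle banks and forces an idle cycle only when necessary.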
