Enabling efficient fine-grained DRAM activations with interleaved I/O

DRAM accounts for a significant part of total system energy consumption, and row activation is one of its most energy-inefficient components. Prior work on fine-grained DRAM activation relies on increasing the number of local wires to avoid degrading performance, which adds area overhead. This work proposes interleaved I/O, which lets data transfers from different partially activated banks share the global I/O. The proposed DRAM architecture allows half-, quarter-, or one-eighth-page activations without changing the wires. System performance is competitive with other fine-grained activation designs. For the evaluated benchmarks, the average performance improvement reaches up to 15.7% across all configurations. Furthermore, total DRAM energy is reduced by an average of 11.2% for half-page, 17.2% for quarter-page, and 22.3% for one-eighth-page activation.
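
To make the reported figures concrete, the following minimal sketch tabulates how much of a page each activation granularity opens and what fraction of the baseline DRAM energy remains after the average reductions quoted above. The page size (`ASSUMED_PAGE_BYTES`) and the helper names are assumptions chosen for illustration only; they do not come from the paper.

```python
# Average total-DRAM-energy reductions quoted in the abstract (percent).
REPORTED_REDUCTION_PCT = {
    "half-page": 11.2,
    "quarter-page": 17.2,
    "one-eighth-page": 22.3,
}

# Fraction of a full page opened by each activation granularity.
PAGE_FRACTION = {
    "half-page": 1 / 2,
    "quarter-page": 1 / 4,
    "one-eighth-page": 1 / 8,
}

# Hypothetical full-page (row buffer) size; NOT taken from the paper.
ASSUMED_PAGE_BYTES = 8192


def activated_bytes(full_page_bytes, granularity):
    """Bytes opened by one activation at the given granularity."""
    return int(full_page_bytes * PAGE_FRACTION[granularity])


def remaining_energy(baseline_energy, granularity):
    """Energy left after applying the reported average reduction."""
    return baseline_energy * (1 - REPORTED_REDUCTION_PCT[granularity] / 100)


if __name__ == "__main__":
    for name in PAGE_FRACTION:
        opened = activated_bytes(ASSUMED_PAGE_BYTES, name)
        left = remaining_energy(100.0, name)
        print(f"{name}: {opened} bytes activated, "
              f"{left:.1f}% of baseline DRAM energy")
```

The reported savings shrink more slowly than the activated fraction of the page, which is consistent with row activation being only one component of total DRAM energy.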
