Mitigating Wordline Crosstalk Using Adaptive Trees of Counters

DRAM technology scaling has the undesirable side effect of degrading cell reliability. One concern in deeply scaled DRAMs is the increased coupling between adjacent cells, commonly referred to as crosstalk. Frequent accesses to certain DRAM rows may cause data loss in cells of physically adjacent rows due to crosstalk. Maliciously exploiting this effect by repeatedly accessing a row is known as row hammering. Inadvertent row hammering may also occur because applications' access patterns are naturally skewed. In this paper, we analyze the efficiency of existing approaches for mitigating wordline crosstalk and demonstrate that they have been conservatively designed. Given the unbalanced nature of DRAM accesses, a small group of dynamically allocated counters in the banks can deterministically detect "hot" rows and mitigate crosstalk. Based on our findings, we propose a Counter-based Adaptive Tree (CAT) approach that mitigates wordline crosstalk by using adaptive trees of counters to guide the refreshing of vulnerable rows. The key idea is to tune the distribution of the counters over the rows in a bank based on the memory reference patterns. In contrast to existing deterministic solutions, CAT uses fewer counters, making it practical to implement on-chip. Compared to existing probabilistic approaches, CAT more precisely refreshes rows vulnerable to crosstalk based on their access frequency. Experimental results on workloads from four benchmark suites show that CAT reduces the Crosstalk Mitigation Refresh Power Overhead in quad-core systems to 7%, an improvement over the 21% and 18% incurred by the leading deterministic and probabilistic approaches, respectively. Moreover, CAT incurs very low performance overhead (~0.5%). Hardware synthesis evaluation shows that CAT can be implemented on-chip with only a nominal area overhead.
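To make the key idea concrete, the following is a minimal Python sketch of an adaptive tree of counters for one bank. It is an illustration under stated assumptions, not the paper's design: the abstract does not specify the tree shape, the split rule, or the thresholds, so the binary split at the midpoint, the names `CounterTree`, `split_threshold`, `refresh_threshold`, and `max_counters`, and the choice to let child counters inherit the parent's count are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Node:
    lo: int                          # first row covered (inclusive)
    hi: int                          # last row covered (inclusive)
    count: int = 0                   # accesses attributed to this row range
    left: Optional["Node"] = None
    right: Optional["Node"] = None


class CounterTree:
    """Hypothetical adaptive tree of counters for a single DRAM bank."""

    def __init__(self, num_rows: int, max_counters: int,
                 split_threshold: int, refresh_threshold: int):
        self.num_rows = num_rows
        self.root = Node(0, num_rows - 1)
        self.max_counters = max_counters            # assumed counter budget
        self.split_threshold = split_threshold      # assumed split rule
        self.refresh_threshold = refresh_threshold  # assumed hammer limit
        self.counters_used = 1                      # one leaf at the start

    def access(self, row: int) -> Optional[Tuple[int, int]]:
        """Record one activation of `row`.

        Returns an inclusive (first_row, last_row) range to refresh when the
        covering counter reaches the refresh threshold, otherwise None.
        """
        # Walk down to the leaf counter whose range contains the row.
        node = self.root
        while node.left is not None:
            node = node.left if row <= node.left.hi else node.right

        node.count += 1

        if node.count >= self.refresh_threshold:
            # Any row in [lo, hi] may be the aggressor, so refresh the
            # range together with its two physical neighbors.
            node.count = 0
            return (max(node.lo - 1, 0), min(node.hi + 1, self.num_rows - 1))

        can_split = node.hi > node.lo and self.counters_used < self.max_counters
        if node.count >= self.split_threshold and can_split:
            # A hot range gets finer-grained counters. Both children inherit
            # the parent's count, a conservative choice that never
            # undercounts any row in the range.
            mid = (node.lo + node.hi) // 2
            node.left = Node(node.lo, mid, node.count)
            node.right = Node(mid + 1, node.hi, node.count)
            self.counters_used += 1  # one leaf replaced by two
        return None
```

In this sketch, repeatedly accessing one row causes the tree to deepen only along the path to that row, so the counter budget is spent where the accesses concentrate while cold regions of the bank share a single coarse counter. For example, with `CounterTree(8, 4, 2, 5)`, five consecutive calls to `access(5)` split the tree down to a single-row leaf, and the fifth call returns the refresh range `(4, 6)`.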
