Flipping bits in memory without accessing them: An experimental study of DRAM disturbance errors

Memory isolation is a key property of a reliable and secure computing system-an access to one memory address should not have unintended side effects on data stored in other addresses. However, as DRAM process technology scales down to smaller dimensions, it becomes more difficult to prevent DRAM cells from electrically interacting with each other. In this paper, we expose the vulnerability of commodity DRAM chips to disturbance errors. By reading from the same address in DRAM, we show that it is possible to corrupt data in nearby addresses. More specifically, activating the same row in DRAM corrupts data in nearby rows. We demonstrate this phenomenon on Intel and AMD systems using a malicious program that generates many DRAM accesses. We induce errors in most DRAM modules (110 out of 129) from three major DRAM manufacturers. From this we conclude that many deployed systems are likely to be at risk. We identify the root cause of disturbance errors as the repeated toggling of a DRAM row's wordline, which stresses inter-cell coupling effects that accelerate charge leakage from nearby rows. We provide an extensive characterization study of disturbance errors and their behavior using an FPGA-based testing platform. Among our key findings, we show that (i) it takes as few as 139K accesses to induce an error and (ii) up to one in every 1.7K cells is susceptible to errors. After examining various potential ways of addressing the problem, we propose a low-overhead solution to prevent the errors.

[1]  Burton H. Bloom,et al.  Space/time trade-offs in hash coding with allowable errors , 1970, CACM.

[2]  Robert H. Morris,et al.  Counting large numbers of events in small registers , 1978, CACM.

[3]  K Marrin,et al.  Semiconductor memory , 1986 .

[4]  H. Hidaka,et al.  A Twisted Bit Line Technique for Multi-Mb Drams , 1988, 1988 IEEE International Solid-State Circuits Conference, 1988 ISSCC. Digest of Technical Papers.

[5]  Y. Konishi,et al.  Analysis of coupling noise between adjacent bit lines in megabit DRAMs , 1989 .

[6]  Dong-Sun Min Dong-Sun Min,et al.  Wordline coupling noise reduction techniques for scaled DRAMs , 1990, Digest of Technical Papers., 1990 Symposium on VLSI Circuits.

[7]  Hiroki Koike,et al.  A 30-ns 64-Mb DRAM with built-in self-test and self-repair function , 1992 .

[8]  Industrial evaluation of DRAM tests , 1999, Design, Automation and Test in Europe Conference and Exhibition, 1999. Proceedings (Cat. No. PR00078).

[9]  Li Fan,et al.  Summary cache: a scalable wide-area web cache sharing protocol , 2000, TNET.

[10]  Chenming Hu,et al.  Impact of gate-induced drain leakage current on the tail distribution of DRAM data retention time , 2000, International Electron Devices Meeting 2000. Technical Digest. IEDM (Cat. No.00CH37138).

[11]  Feng Lin,et al.  DRAM circuit design , 2000 .

[12]  Bruce F. Cockburn,et al.  An investigation into crosstalk noise in DRAM structures , 2002, Proceedings of the 2002 IEEE International Workshop on Memory Technology, Design and Testing (MTDT2002).

[13]  Robert H. Dennard,et al.  Challenges and future directions for the scaling of dynamic random-access memory (DRAM) , 2002, IBM J. Res. Dev..

[14]  Ad J. van de Goor,et al.  Address and data scrambling: causes and impact on memory tests , 2002, Proceedings First IEEE International Workshop on Electronic Design, Test and Applications '2002.

[15]  Saibal Mukhopadhyay,et al.  Leakage current mechanisms and leakage reduction techniques in deep-submicrometer CMOS circuits , 2003, Proc. IEEE.

[16]  Roberto Bez,et al.  Introduction to flash memory , 2003, Proc. IEEE.

[17]  Mason L. Williams,et al.  Cross-track noise profile measurement for adjacent-track interference study and write-current optimization in perpendicular recording , 2003 .

[18]  Yossi Matias,et al.  Spectral bloom filters , 2003, SIGMOD '03.

[19]  Richard M. Karp,et al.  A simple algorithm for finding frequent elements in streams and bags , 2003, TODS.

[20]  유쿠타케세이고,et al.  A semiconductor memory , 2004 .

[21]  Kaushik Roy,et al.  Modeling and testing of SRAM for new failure mechanisms due to process variations in nanoscale CMOS , 2005, 23rd IEEE VLSI Test Symposium (VTS'05).

[22]  Zaid Al-Ars DRAM fault analysis and test generation , 2005 .

[23]  Eric Rotenberg,et al.  Retention-aware placement in DRAM (RAPID): software methods for quasi-non-volatile DRAM , 2006, The Twelfth International Symposium on High-Performance Computer Architecture, 2006..

[24]  Said Hamdioui,et al.  DRAM-Specific Space of Memory Tests , 2006, 2006 IEEE International Test Conference.

[25]  Dong Tang,et al.  Assessment of the Effect of Memory Page Retirement on System RAS Against Hardware Faults , 2006, International Conference on Dependable Systems and Networks (DSN'06).

[26]  Bianca Schroeder,et al.  Disk Failures in the Real World: What Does an MTTF of 1, 000, 000 Hours Mean to You? , 2007, FAST.

[27]  Eduardo Pinheiro,et al.  Failure Trends in a Large Disk Drive Population , 2007, FAST.

[28]  Feng Lin,et al.  DRAM Circuit Design: Fundamental and High-Speed Topics , 2007 .

[29]  J. Zhu,et al.  Understanding Adjacent Track Erasure in Discrete Track Media , 2008, IEEE Transactions on Magnetics.

[30]  Paul H. Siegel,et al.  Characterizing flash memory: Anomalies, observations, and applications , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[31]  Benjamin Van Durme,et al.  Probabilistic Counting with Randomized Storage , 2009, IJCAI.

[32]  A. Kavcic,et al.  The Feasibility of Magnetic Recording at 10 Terabits Per Square Inch on Conventional Media , 2009, IEEE Transactions on Magnetics.

[33]  Borivoje Nikolic,et al.  Large-Scale SRAM Variability Characterization in 45 nm CMOS , 2009, IEEE Journal of Solid-State Circuits.

[34]  Rei-Fu Huang,et al.  Fault models for embedded-DRAM macros , 2009, 2009 46th ACM/IEEE Design Automation Conference.

[35]  Jerome H. Saltzer,et al.  Principles of Computer System Design: An Introduction , 2009 .

[36]  Shi-Jie Wen,et al.  New DRAM HCI qualification method emphasizing on repeated memory access , 2010, 2010 IEEE International Integrated Reliability Workshop Final Report.

[37]  Masashi Horiguchi,et al.  Nanoscale Memory Repair , 2011, Integrated Circuits and Systems.

[38]  David Blaauw,et al.  Variation-aware static and dynamic writability analysis for voltage-scaled bit-interleaved 8-T SRAMs , 2011, IEEE/ACM International Symposium on Low Power Electronics and Design.

[39]  Richard Veras,et al.  RAIDR: Retention-aware intelligent DRAM refresh , 2012, 2012 39th Annual International Symposium on Computer Architecture (ISCA).

[40]  Rei-Fu Huang,et al.  Alternate hammering test for application-specific DRAMs and an industrial case study , 2012, DAC Design Automation Conference 2012.

[41]  Dam Sunwoo,et al.  Balancing DRAM locality and parallelism in shared memory CMP systems , 2012, IEEE International Symposium on High-Performance Comp Architecture.

[42]  Lei Liu,et al.  A software memory partition approach for eliminating bank-level interference in multicore systems , 2012, 2012 21st International Conference on Parallel Architectures and Compilation Techniques (PACT).

[43]  Onur Mutlu,et al.  Error patterns in MLC NAND flash memory: Measurement, characterization, and analysis , 2012, 2012 Design, Automation & Test in Europe Conference & Exhibition (DATE).

[44]  Onur Mutlu,et al.  A case for exploiting subarray-level parallelism (SALP) in DRAM , 2012, 2012 39th Annual International Symposium on Computer Architecture (ISCA).

[45]  Björn Andersson,et al.  Coordinated Bank and Cache Coloring for Temporal Protection of Memory Accesses , 2013, 2013 IEEE 16th International Conference on Computational Science and Engineering.

[46]  Dae-Hyun Kim,et al.  ArchShield: architectural framework for assisting DRAM scaling by tolerating high error rates , 2013, ISCA.

[47]  Onur Mutlu,et al.  An experimental study of data retention behavior in modern DRAM devices: implications for retention time profiling mechanisms , 2013, ISCA.

[48]  Onur Mutlu,et al.  Program interference in MLC NAND flash memory: Characterization, modeling, and mitigation , 2013, ICCD.

[49]  Onur Mutlu,et al.  Tiered-latency DRAM: A low latency and low cost DRAM architecture , 2013, 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA).

[50]  Onur Mutlu,et al.  Memory scaling: A systems architecture perspective , 2013, 2013 5th IEEE International Memory Workshop.

[51]  Onur Mutlu,et al.  The efficacy of error mitigation techniques for DRAM retention failures: a comparative experimental study , 2014, SIGMETRICS '14.