Efficient coding scheme for DDR4 memory subsystems

DRAMs face two major challenges. First, DRAM bit cells are leaky and must be refreshed periodically to ensure data integrity, so DRAM devices incur a large refresh overhead in both performance (available bandwidth) and power. Second, reliability issues caused by technology scaling are a growing concern. ECC techniques for DRAM errors, and for retention errors in particular, are therefore becoming increasingly important. In this paper, we investigate DRAM errors and derive a detailed model for these error types. The model is verified by measurements and analyzed from an information-theoretic point of view. Based on this model, we present a coding scheme that substantially improves DRAM reliability at low overhead.
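
To make the information-theoretic view more concrete, the following minimal sketch compares the capacity of a symmetric binary channel with that of an asymmetric one. It assumes, as a simplification not stated in the abstract, that retention errors flip bits in one direction only (a charged cell can leak to the discharged state but not the reverse), which corresponds to a Z-channel. The function names and the example error probabilities are illustrative, not taken from the paper.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy H(p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def bsc_capacity(p: float) -> float:
    """Capacity of a binary symmetric channel with crossover probability p."""
    return 1.0 - binary_entropy(p)


def z_channel_capacity(p: float) -> float:
    """Capacity of a Z-channel in which only one symbol is flipped, with probability p.

    Uses the closed form C = log2(1 + (1 - p) * p**(p / (1 - p))).
    """
    if p == 0.0:
        return 1.0
    if p == 1.0:
        return 0.0
    return math.log2(1.0 + (1.0 - p) * p ** (p / (1.0 - p)))


if __name__ == "__main__":
    # Hypothetical error probabilities, chosen only for illustration.
    for p in (1e-6, 1e-3, 1e-1):
        print(f"p={p:g}  BSC={bsc_capacity(p):.6f}  Z={z_channel_capacity(p):.6f}")
```

Under this assumption, the asymmetric channel has a higher capacity than the symmetric one at the same error probability, which is why a code tailored to one-directional errors can, in principle, need less redundancy than a code designed for symmetric errors.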
