Reliability Issues in Flash-Memory-Based Solid-State Drives: Experimental Analysis, Mitigation, Recovery

NAND flash memory is ubiquitous in everyday life today because its capacity has continuously increased and cost has continuously decreased over decades. This positive growth is a result of two key trends: (1) effective process technology scaling; and (2) multi-level (e.g., MLC, TLC) cell data coding. Unfortunately, the reliability of raw data stored in flash memory has also continued to become more difficult to ensure, because these two trends lead to (1) fewer electrons in the flash memory cell floating gate to represent the data; and (2) larger cell-to-cell interference and disturbance effects. Without mitigation, worsening reliability can reduce the lifetime of NAND flash memory. As a result, flash memory controllers in solid-state drives (SSDs) have become much more sophisticated: they incorporate many effective techniques to ensure the correct interpretation of noisy data stored in flash memory cells. In this chapter, we review recent advances in SSD error characterization, mitigation, and data recovery techniques for reliability and lifetime improvement. We provide rigorous experimental data from state-of-the-art MLC and TLC NAND flash devices on various types of flash memory errors, to motivate the need for such techniques. Based on the understanding developed by the experimental characterization, we describe several mitigation and recovery techniques, including (1) cell-to-cell interference mitigation; (2) optimal multi-level cell sensing; (3) error correction using state-of-the-art algorithms and methods; and (4) data recovery when error correction fails. We quantify the reliability improvement provided by each of these techniques. Looking forward, we briefly discuss how flash memory and these techniques could evolve into the future.

[1]  Roberto Bez,et al.  Introduction to flash memory , 2003, Proc. IEEE.

[2]  Song Liu,et al.  Flikker: saving DRAM refresh-power through critical data partitioning , 2011, ASPLOS XVI.

[3]  Ming Zhao,et al.  How Much Can Data Compressibility Help to Improve NAND Flash Memory Lifetime? , 2015, FAST.

[4]  Chin-Long Chen,et al.  High-speed decoding of BCH codes , 1981, IEEE Trans. Inf. Theory.

[5]  D. Yaney,et al.  A meta-stable leakage phenomenon in DRAM charge storage —Variable hold time , 1987, 1987 International Electron Devices Meeting.

[6]  Onur Mutlu,et al.  Program interference in MLC NAND flash memory: Characterization, modeling, and mitigation , 2013, ICCD.

[7]  Herbert Bos,et al.  Dedup Est Machina: Memory Deduplication as an Advanced Exploitation Vector , 2016, 2016 IEEE Symposium on Security and Privacy (SP).

[8]  Sungjin Lee,et al.  Improving performance and lifetime of NAND storage systems using relaxed program sequence , 2016, 2016 53nd ACM/EDAC/IEEE Design Automation Conference (DAC).

[9]  Mor Harchol-Balter,et al.  ATLAS: A scalable and high-performance scheduling algorithm for multiple memory controllers , 2010, HPCA - 16 2010 The Sixteenth International Symposium on High-Performance Computer Architecture.

[10]  Jin Sun,et al.  Utility-Based Hybrid Memory Management , 2017, 2017 IEEE International Conference on Cluster Computing (CLUSTER).

[11]  Onur Mutlu,et al.  Tiered-latency DRAM: A low latency and low cost DRAM architecture , 2013, 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA).

[12]  Onur Mutlu,et al.  Detecting and Mitigating Data-Dependent DRAM Failures by Exploiting Current Memory Content , 2017, 2017 50th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[13]  Sivan Toledo,et al.  Compression and SSDs: Where and How? , 2014, INFLOW.

[14]  Onur Mutlu,et al.  Using ECC DRAM to Adaptively Increase Memory Capacity , 2017, ArXiv.

[15]  Sungjin Lee,et al.  Lifetime improvement of NAND flash-based storage systems using dynamic program and erase scaling , 2014, FAST.

[16]  Zhe Zhang,et al.  Memory module-level testing and error behaviors for phase change memory , 2012, 2012 IEEE 30th International Conference on Computer Design (ICCD).

[17]  Achilleas Anastasopoulos,et al.  A comparison between the sum-product and the min-sum iterative detection algorithms based on density evolution , 2001, GLOBECOM'01. IEEE Global Telecommunications Conference (Cat. No.01CH37270).

[18]  Onur Mutlu,et al.  BLISS: Balancing Performance, Fairness and Complexity in Memory Access Scheduling , 2016, IEEE Transactions on Parallel and Distributed Systems.

[19]  Nanning Zheng,et al.  LDPC-in-SSD: making advanced error correction codes work effectively in solid state drives , 2013, FAST.

[20]  Osman S. Unsal,et al.  Flash correct-and-refresh: Retention-aware error management for increased flash memory lifetime , 2012, 2012 IEEE 30th International Conference on Computer Design (ICCD).

[21]  Seokkiu Lee,et al.  Highly reliable 26nm 64Gb MLC E2NAND (Embedded-ECC & Enhanced-efficiency) flash memory with MSP (Memory Signal Processing) controller , 2011, 2011 Symposium on VLSI Technology - Digest of Technical Papers.

[22]  Mahmut T. Kandemir,et al.  Evaluating STT-RAM as an energy-efficient main memory alternative , 2013, 2013 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).

[23]  Rina Panigrahy,et al.  Design Tradeoffs for SSD Performance , 2008, USENIX ATC.

[24]  Aamer Jaleel,et al.  BEAR: Techniques for mitigating bandwidth bloat in gigascale DRAM caches , 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).

[25]  Dwijendra K. Ray-Chaudhuri,et al.  Binary mixture flow with free energy lattice Boltzmann methods , 2022, arXiv.org.

[26]  S. Arrhenius Über die Dissociationswärme und den Einfluss der Temperatur auf den Dissociationsgrad der Elektrolyte , 1889 .

[27]  Norbert Wehn,et al.  Exploiting expendable process-margins in DRAMs for run-time performance optimization , 2014, 2014 Design, Automation & Test in Europe Conference & Exhibition (DATE).

[28]  Wei Liu,et al.  VLSI Implementation of BCH Error Correction for Multilevel Cell NAND Flash Memory , 2010, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.

[29]  Moinuddin K. Qureshi,et al.  Reducing Refresh Power in Mobile Devices with Morphable ECC , 2015, 2015 45th Annual IEEE/IFIP International Conference on Dependable Systems and Networks.

[30]  Robert T. Chien,et al.  Cyclic decoding procedures for Bose- Chaudhuri-Hocquenghem codes , 1964, IEEE Trans. Inf. Theory.

[31]  Onur Mutlu,et al.  Parallelism-Aware Batch Scheduling: Enhancing both Performance and Fairness of Shared DRAM Systems , 2008, 2008 International Symposium on Computer Architecture.

[32]  José F. Martínez,et al.  Improving memory scheduling via processor-side load criticality information , 2013, ISCA.

[33]  Young-Ho Lim,et al.  A 3.3 V 32 Mb NAND flash memory with incremental step pulse programming scheme , 1995 .

[34]  Jongmoo Choi,et al.  Decoupled Direct Memory Access: Isolating CPU and IO Traffic by Leveraging a Dual-Data-Port DRAM , 2015, 2015 International Conference on Parallel Architecture and Compilation (PACT).

[35]  Kai Li,et al.  RIPQ: Advanced Photo Caching on Flash for Facebook , 2015, FAST.

[36]  Shuhei Tanakamaru,et al.  95%-lower-BER 43%-lower-power intelligent solid-state drive (SSD) with asymmetric coding and stripe pattern elimination algorithm , 2011, 2011 IEEE International Solid-State Circuits Conference.

[37]  Tong Zhang,et al.  Quasi-nonvolatile SSD: Trading flash memory nonvolatility to improve storage system performance for enterprise applications , 2012, IEEE International Symposium on High-Performance Comp Architecture.

[38]  Onur Mutlu,et al.  Read Disturb Errors in MLC NAND Flash Memory: Characterization, Mitigation, and Recovery , 2015, 2015 45th Annual IEEE/IFIP International Conference on Dependable Systems and Networks.

[39]  Cheng-Wen Wu,et al.  An Adaptive-Rate Error Correction Scheme for NAND Flash Memory , 2009, 2009 27th IEEE VLSI Test Symposium.

[40]  Onur Mutlu,et al.  The RowHammer problem and other issues we may face as memory becomes denser , 2017, Design, Automation & Test in Europe Conference & Exhibition (DATE), 2017.

[41]  Wei Wu,et al.  Energy-efficient cache design using variable-strength error-correcting codes , 2011, 2011 38th Annual International Symposium on Computer Architecture (ISCA).

[42]  Onur Mutlu,et al.  Understanding Latency Variation in Modern DRAM Chips: Experimental Characterization, Analysis, and Optimization , 2016, SIGMETRICS.

[43]  Yohwan Koh,et al.  NAND Flash Scaling Beyond 20nm , 2009, 2009 IEEE International Memory Workshop.

[44]  Luca Crippa,et al.  A 4Gb 2b/cell NAND Flash Memory with Embedded 5b BCH ECC for 36MB/s System Read Throughput , 2006, 2006 IEEE International Solid State Circuits Conference - Digest of Technical Papers.

[45]  Tao Xie,et al.  Understanding the impact of threshold voltage on MLC flash memory performance and reliability , 2014, ICS '14.

[46]  Mrinmoy Ghosh,et al.  Performance Characterization of Hyperscale Applicationson on NVMe SSDs , 2015, SIGMETRICS.

[47]  Onur Mutlu,et al.  Simultaneous Multi-Layer Access , 2016, ACM Trans. Archit. Code Optim..

[48]  Kinam Kim,et al.  Degradation of tunnel oxide by FN current stress and its effects on data retention characteristics of 90 nm NAND flash memory cells , 2003, 2003 IEEE International Reliability Physics Symposium Proceedings, 2003. 41st Annual..

[49]  Zhen Fang,et al.  Leveraging Heterogeneity in DRAM Main Memories to Accelerate Critical Word Access , 2012, 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture.

[50]  Young-Hyun Jun,et al.  A new 3-bit programming algorithm using SLC-to-TLC migration for 8MB/s high performance TLC NAND flash memory , 2012, 2012 Symposium on VLSI Circuits (VLSIC).

[51]  Lizy Kurian John,et al.  Elastic Refresh: Techniques to Mitigate Refresh Penalties in High Density Memory , 2010, 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture.

[52]  Bianca Schroeder,et al.  Cosmic rays don't strike twice: understanding the nature of DRAM errors and the implications for system design , 2012, ASPLOS XVII.

[53]  Jihong Kim,et al.  An Integrated Approach for Managing Read Disturbs in High-Density NAND Flash Memory , 2016, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.

[54]  Sai Prashanth Muralidhara,et al.  Reducing memory interference in multicore systems via application-aware memory channel partitioning , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[55]  Jungdal Choi,et al.  Effects of floating-gate interference on NAND flash memory cell operation , 2002 .

[56]  Onur Mutlu,et al.  AVATAR: A Variable-Retention-Time (VRT) Aware Refresh for DRAM Systems , 2015, 2015 45th Annual IEEE/IFIP International Conference on Dependable Systems and Networks.

[57]  James L. Massey,et al.  Shift-register synthesis and BCH decoding , 1969, IEEE Trans. Inf. Theory.

[58]  Sungho Kang,et al.  Data Randomization Scheme for Endurance Enhancement and Interference Mitigation of Multilevel Flash Memory Devices , 2013 .

[59]  C. E. SHANNON,et al.  A mathematical theory of communication , 1948, MOCO.

[60]  Mattan Erez,et al.  Bamboo ECC: Strong, safe, and flexible codes for reliable computer memory , 2015, 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA).

[61]  Onur Mutlu,et al.  Stall-Time Fair Memory Access Scheduling for Chip Multiprocessors , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).

[62]  Elwyn R. Berlekamp Nonbinary BCH decoding (Abstr.) , 1968, IEEE Trans. Inf. Theory.

[63]  Onur Mutlu,et al.  Invited: Who is the major threat to tomorrow's security? You, the hardware designer , 2016, 2016 53nd ACM/EDAC/IEEE Design Automation Conference (DAC).

[64]  Sriram Sankar,et al.  reFresh SSDs: Enabling High Endurance, Low Cost Flash in Datacenters , 2012 .

[65]  Onur Mutlu,et al.  Vulnerabilities in MLC NAND Flash Memory Programming: Experimental Analysis, Exploits, and Mitigation Techniques , 2017, 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA).

[66]  J. W. Park,et al.  DRAM variable retention time , 1992, 1992 International Technical Digest on Electron Devices Meeting.

[67]  Onur Mutlu,et al.  Understanding Reduced-Voltage Operation in Modern DRAM Devices , 2017, Proc. ACM Meas. Anal. Comput. Syst..

[68]  K. Sakui,et al.  A negative V/sub th/ cell architecture for highly scalable, excellently noise-immune, and highly reliable NAND flash memories , 1999 .

[69]  Chundong Wang,et al.  Extending the lifetime of NAND flash memory by salvaging bad blocks , 2012, 2012 Design, Automation & Test in Europe Conference & Exhibition (DATE).

[70]  Eric Rotenberg,et al.  Retention-aware placement in DRAM (RAPID): software methods for quasi-non-volatile DRAM , 2006, The Twelfth International Symposium on High-Performance Computer Architecture, 2006..

[71]  Peter Desnoyers,et al.  Analytic modeling of SSD write performance , 2012, SYSTOR '12.

[72]  John Shalf,et al.  Memory Errors in Modern Systems: The Good, The Bad, and The Ugly , 2015, ASPLOS.

[73]  Onur Mutlu,et al.  Memory scaling: A systems architecture perspective , 2013, 2013 5th IEEE International Memory Workshop.

[74]  Norman P. Jouppi,et al.  LOT-ECC: Localized and tiered reliability mechanisms for commodity memory systems , 2012, 2012 39th Annual International Symposium on Computer Architecture (ISCA).

[75]  Rino Micheloni,et al.  BCH and LDPC Error Correction Codes for NAND Flash Memories , 2016, 3D Flash Memories.

[76]  Vincent Rijmen,et al.  The Design of Rijndael , 2002, Information Security and Cryptography.

[77]  Sivan Toledo,et al.  Algorithms and data structures for flash memories , 2005, CSUR.

[78]  D. Ielmini,et al.  Recovery and Drift Dynamics of Resistance and Threshold Voltages in Phase-Change Memories , 2007, IEEE Transactions on Electron Devices.

[79]  Osman S. Unsal,et al.  Neighbor-cell assisted error correction for MLC NAND flash memories , 2014, SIGMETRICS '14.

[80]  Mor Harchol-Balter,et al.  Thread Cluster Memory Scheduling: Exploiting Differences in Memory Access Behavior , 2010, 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture.

[81]  S. Luryi,et al.  Charge injection transistor based on real-space hot-electron transfer , 1984, IEEE Transactions on Electron Devices.

[82]  Y. Fukuzumi,et al.  Disturbless flash memory due to high boost efficiency on BiCS structure and optimal memory film stack for ultra high density storage device , 2008, 2008 IEEE International Electron Devices Meeting.

[83]  J. E. Brewer,et al.  Nonvolatile Memory Technologies with Emphasis on Flash: A Comprehensive Guide to Understanding and Using Flash Memory Devices , 2008 .

[84]  Onur Mutlu,et al.  Efficient Data Mapping and Buffering Techniques for Multilevel Cell Phase-Change Memories , 2014, ACM Trans. Archit. Code Optim..

[85]  Chilhee Chung,et al.  New scaling limitation of the floating gate cell in NAND Flash Memory , 2010, 2010 IEEE International Reliability Physics Symposium.

[86]  Wook-Ghee Hahn,et al.  7.2 A 128Gb 3b/cell V-NAND flash memory with 1Gb/s I/O rate , 2015, 2015 IEEE International Solid-State Circuits Conference - (ISSCC) Digest of Technical Papers.

[87]  Yuval Yarom,et al.  Another Flip in the Wall of Rowhammer Defenses , 2017, 2018 IEEE Symposium on Security and Privacy (SP).

[88]  Onur Mutlu,et al.  Ramulator: A Fast and Extensible DRAM Simulator , 2016, IEEE Computer Architecture Letters.

[89]  Mohammed Atiquzzaman,et al.  VLSI Architectures for Layered Decoding for Irregular LDPC Codes of WiMax , 2007, 2007 IEEE International Conference on Communications.

[90]  Onur Mutlu,et al.  MISE: Providing performance predictability and improving fairness in shared main memory systems , 2013, 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA).

[91]  O. Mutlu,et al.  Enabling Accurate and Practical Online Flash Channel Modeling for Modern MLC NAND Flash Memory , 2016, IEEE Journal on Selected Areas in Communications.

[92]  Jeong-Don Ihm,et al.  7.1 256Gb 3b/cell V-NAND flash memory with 48 stacked WL layers , 2016, 2016 IEEE International Solid-State Circuits Conference (ISSCC).

[93]  Aamer Jaleel,et al.  CAMEO: A Two-Level Memory Organization with Capacity of Main Memory and Flexibility of Hardware-Managed Cache , 2014, 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture.

[94]  Jongmoo Choi,et al.  WARM: Improving NAND flash memory lifetime with write-hotness aware retention management , 2015, 2015 31st Symposium on Mass Storage Systems and Technologies (MSST).

[95]  Onur Mutlu,et al.  Error Characterization, Mitigation, and Recovery in Flash-Memory-Based Solid-State Drives , 2017, Proceedings of the IEEE.

[96]  Yanick Fratantonio,et al.  Drammer: Deterministic Rowhammer Attacks on Mobile Platforms , 2016, CCS.

[97]  Kinam Kim,et al.  A New Investigation of Data Retention Time in Truly Nanoscaled DRAMs , 2009, IEEE Electron Device Letters.

[98]  Dong Woo Kim,et al.  Vertical cell array using TCAT(Terabit Cell Array Transistor) technology for ultra high density NAND flash memory , 2006, 2009 Symposium on VLSI Technology.

[99]  Rachata Ausavarungnirun,et al.  Design-Induced Latency Variation in Modern DRAM Chips: Characterization, Analysis, and Latency Reduction Mechanisms , 2017, SIGMETRICS.

[100]  Nikolaos Papandreou,et al.  Using adaptive read voltage thresholds to enhance the reliability of MLC NAND flash memory systems , 2014, GLSVLSI '14.

[101]  Sung-Soo Lee,et al.  A 7MB/s 64Gb 3-bit/cell DDR NAND flash memory in 20nm-node technology , 2011, 2011 IEEE International Solid-State Circuits Conference.

[102]  Rami G. Melhem,et al.  Refresh Now and Then , 2014, IEEE Transactions on Computers.

[103]  Yuan Xiao,et al.  One Bit Flips, One Cloud Flops: Cross-VM Row Hammer Attacks and Privilege Escalation , 2016, USENIX Security Symposium.

[104]  Onur Mutlu,et al.  A Case for Memory Content-Based Detection and Mitigation of Data-Dependent Failures in DRAM , 2017, IEEE Computer Architecture Letters.

[105]  Onur Mutlu,et al.  Self-Optimizing Memory Controllers: A Reinforcement Learning Approach , 2008, 2008 International Symposium on Computer Architecture.

[106]  Onur Mutlu,et al.  Adaptive-latency DRAM: Optimizing DRAM timing for the common-case , 2015, 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA).

[107]  Onur Mutlu,et al.  The reach profiler (REAPER): Enabling the mitigation of DRAM retention failures via profiling at aggressive conditions , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).

[108]  Y. Mori,et al.  Analysis of detrap current due to oxide traps to improve flash memory retention , 2000, 2000 IEEE International Reliability Physics Symposium Proceedings. 38th Annual (Cat. No.00CH37059).

[109]  Youjip Won,et al.  I/O Stack Optimization for Smartphones , 2013, USENIX ATC.

[110]  Onur Mutlu,et al.  An experimental study of data retention behavior in modern DRAM devices: implications for retention time profiling mechanisms , 2013, ISCA.

[111]  Yixin Luo,et al.  Improving the reliability of chip-off forensic analysis of NAND flash memory devices , 2017, Digit. Investig..

[112]  Sung-Jin Choi,et al.  Comprehensive evaluation of early retention (fast charge loss within a few seconds) characteristics in tube-type 3-D NAND flash memory , 2016, 2016 IEEE Symposium on VLSI Technology.

[113]  Wei Wu,et al.  Reducing cache power with low-cost, multi-bit error-correcting codes , 2010, ISCA.

[114]  Mingzhen Xu,et al.  Extended Arrhenius law of time-to-breakdown of ultrathin gate oxides , 2003 .

[115]  Onur Mutlu,et al.  Threshold voltage distribution in MLC NAND flash memory: Characterization, analysis, and modeling , 2013, 2013 Design, Automation & Test in Europe Conference & Exhibition (DATE).

[116]  D. Stewart,et al.  The missing memristor found , 2008, Nature.

[117]  Thomas P. Parnell,et al.  Modelling of the threshold voltage distributions of sub-20nm NAND flash memory , 2014, 2014 IEEE Global Communications Conference.

[118]  S. Sze,et al.  A floating gate and its application to memory devices , 1967 .

[119]  Zili Shao,et al.  MNFTL: An efficient flash translation layer for MLC NAND flash memory storage systems , 2011, 2011 48th ACM/EDAC/IEEE Design Automation Conference (DAC).

[120]  Luca Crippa,et al.  Array Architectures for 3-D NAND Flash Memories , 2017, Proceedings of the IEEE.

[121]  Onur Mutlu,et al.  A Case for Effic ient Hardware/Soft ware Cooperative Management of Storage and Memory , 2013 .

[122]  Paul H. Siegel,et al.  Characterizing flash memory: Anomalies, observations, and applications , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[123]  Qiang Wu,et al.  A Large-Scale Study of Flash Memory Failures in the Field , 2015, SIGMETRICS 2015.

[124]  Onur Mutlu,et al.  Improving 3D NAND Flash Memory Lifetime by Tolerating Early Retention Loss and Process Variation , 2018, SIGMETRICS.

[125]  Randy H. Katz,et al.  A case for redundant arrays of inexpensive disks (RAID) , 1988, SIGMOD '88.

[126]  Y. Iwata,et al.  Pipe-shaped BiCS flash memory with 16 stacked layers and multi-level-cell operation for ultra high density storage devices , 2006, 2009 Symposium on VLSI Technology.

[127]  Andrea C. Arpaci-Dusseau,et al.  The Unwritten Contract of Solid State Drives , 2017, EuroSys.

[128]  William Ryan,et al.  Channel Codes by William Ryan , 2009 .

[129]  Kevin K. Chang,et al.  Understanding and Improving the Latency of DRAM-Based Memory Systems , 2017, ArXiv.

[130]  Seiichi Aritome,et al.  Data-Retention Characteristics Comparison of 2D and 3D TLC NAND Flash Memories , 2017, 2017 IEEE International Memory Workshop (IMW).

[131]  Jun Yang,et al.  Phase-Change Technology and the Future of Main Memory , 2010, IEEE Micro.

[132]  Hong Jiang,et al.  Performance impact and interplay of SSD parallelism through advanced commands, allocation strategy and data granularity , 2011, ICS '11.

[133]  Onur Mutlu,et al.  Improving DRAM performance by parallelizing refreshes with accesses , 2014, 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA).

[134]  Onur Mutlu,et al.  Improving memory Bank-Level Parallelism in the presence of prefetching , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[135]  Jun Yang,et al.  Mitigating Write Disturbance in Super-Dense Phase Change Memories , 2014, 2014 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks.

[136]  Lara Dolecek,et al.  Spatially-aware adaptive error correcting codes for flash memory , 2011, 2011 Conference Record of the Forty Fifth Asilomar Conference on Signals, Systems and Computers (ASILOMAR).

[137]  Alessandro Calderoni,et al.  Challenges for high-density 16Gb ReRAM with 27nm technology , 2015 .

[138]  Ken Mai,et al.  FPGA-Based Solid-State Drive Prototyping Platform , 2011, 2011 IEEE 19th Annual International Symposium on Field-Programmable Custom Computing Machines.

[139]  S. Phadke,et al.  MLP aware heterogeneous memory system , 2011, 2011 Design, Automation & Test in Europe.

[140]  Luca Crippa,et al.  3D NAND Flash Memories , 2018 .

[141]  Vijayalakshmi Srinivasan,et al.  Scalable high performance main memory system using phase-change memory technology , 2009, ISCA '09.

[142]  Mrinmoy Ghosh,et al.  Performance analysis of NVMe SSDs and their implication on real world databases , 2015, SYSTOR.

[143]  Onur Mutlu,et al.  The DRAM Latency PUF: Quickly Evaluating Physical Unclonable Functions by Exploiting the Latency-Reliability Tradeoff in Modern Commodity DRAM Devices , 2018, 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA).

[144]  Hideki Imai,et al.  Reduced complexity iterative decoding of low-density parity check codes based on belief propagation , 1999, IEEE Trans. Commun..

[145]  Yu Cai NAND flash memory: Characterization, analysis, modelling, and mechanisms , 2012 .

[146]  Robert Michael Tanner,et al.  A recursive approach to low complexity codes , 1981, IEEE Trans. Inf. Theory.

[147]  Onur Mutlu,et al.  Enabling Efficient and Scalable Hybrid Memories Using Fine-Granularity DRAM Cache Management , 2012, IEEE Computer Architecture Letters.

[148]  Tao Li,et al.  Exploring Phase Change Memory and 3D Die-Stacking for Power/Thermal Friendly, Fast and Durable Memory Architectures , 2009, 2009 18th International Conference on Parallel Architectures and Compilation Techniques.

[149]  Richard D. Wesel,et al.  Enhanced Precision Through Multiple Reads for LDPC Decoding in Flash Memories , 2013, IEEE Journal on Selected Areas in Communications.

[150]  Jun Yang,et al.  A durable and energy efficient main memory using phase change memory technology , 2009, ISCA '09.

[151]  Onur Mutlu,et al.  The Blacklisting Memory Scheduler: Achieving high performance and fairness at low cost , 2014, 2014 IEEE 32nd International Conference on Computer Design (ICCD).

[152]  Onur Mutlu,et al.  The efficacy of error mitigation techniques for DRAM retention failures: a comparative experimental study , 2014, SIGMETRICS '14.

[153]  Yeong-Taek Lee,et al.  A Zeroing Cell-to-Cell Interference Page Architecture With Temporary LSB Storing and Parallel MSB Program Scheme for MLC NAND Flash Memories , 2008, IEEE Journal of Solid-State Circuits.

[154]  A. Pirovano,et al.  Low-field amorphous state resistance and threshold voltage drift in chalcogenide materials , 2004, IEEE Transactions on Electron Devices.

[155]  Richard Veras,et al.  RAIDR: Retention-aware intelligent DRAM refresh , 2012, 2012 39th Annual International Symposium on Computer Architecture (ISCA).

[156]  Bruce Jacob,et al.  The performance of PC solid-state disks (SSDs) as a function of bandwidth, concurrency, device architecture, and system organization , 2009, ISCA '09.

[157]  Youngjae Kim,et al.  DFTL: a flash translation layer employing demand-based selective caching of page-level address mappings , 2009, ASPLOS.

[158]  Onur Mutlu,et al.  Error patterns in MLC NAND flash memory: Measurement, characterization, and analysis , 2012, 2012 Design, Automation & Test in Europe Conference & Exhibition (DATE).

[159]  Tao Li,et al.  Informed Microarchitecture Design Space Exploration Using Workload Dynamics , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).

[160]  Ricardo Bianchini,et al.  Page placement in hybrid memory systems , 2011, ICS '11.

[161]  Mattan Erez,et al.  Frugal ECC: efficient and versatile memory error protection through fine-grained compression , 2015, SC15: International Conference for High Performance Computing, Networking, Storage and Analysis.

[162]  Sang-Won Lee,et al.  A survey of Flash Translation Layer , 2009, J. Syst. Archit..

[163]  Herbert Bos,et al.  Flip Feng Shui: Hammering a Needle in the Software Stack , 2016, USENIX Security Symposium.

[164]  Heeseung Jo,et al.  A superblock-based flash translation layer for NAND flash memory , 2006, EMSOFT '06.

[165]  Mircea R. Stan,et al.  How I Learned to Stop Worrying and Love Flash Endurance , 2010, HotStorage.

[166]  Onur Mutlu,et al.  Memory Performance Attacks: Denial of Memory Service in Multi-Core Systems , 2007, USENIX Security Symposium.

[167]  R. E. Oleksiak,et al.  The variable threshold transistor, a new electrically-alterable, non-destructive read-only storage device , 1967 .

[168]  Onur Mutlu,et al.  PARBOR: An Efficient System-Level Technique to Detect Data-Dependent Failures in DRAM , 2016, 2016 46th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN).

[169]  Jonghoon Park,et al.  11.4 A 512Gb 3b/cell 64-stacked WL 3D V-NAND flash memory , 2017, 2017 IEEE International Solid-State Circuits Conference (ISSCC).

[170]  William Ryan,et al.  Channel Codes: Classical and Modern , 2009 .

[171]  Alessandro Calderoni,et al.  A copper ReRAM cell for Storage Class Memory applications , 2014, 2014 Symposium on VLSI Technology (VLSI-Technology): Digest of Technical Papers.

[172]  Xubin He,et al.  Reducing SSD read latency via NAND flash program and erase suspension , 2012, FAST.

[173]  Evangelos Eleftheriou,et al.  Write amplification analysis in flash-based solid state drives , 2009, SYSTOR '09.

[174]  Eduardo Pinheiro,et al.  DRAM errors in the wild: a large-scale field study , 2009, SIGMETRICS '09.

[175]  Sudhanva Gurumurthi,et al.  Feng Shui of supercomputer memory positional effects in DRAM and SRAM faults , 2013, 2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC).

[176]  Judea Pearl,et al.  Reverend Bayes on Inference Engines: A Distributed Hierarchical Approach , 1982, AAAI.

[177]  Tony Givargis,et al.  Deterministic service guarantees for nand flash using partial block cleaning , 2008, CODES+ISSS '08.

[178]  In-Cheol Park,et al.  6.4Gb/s multi-threaded BCH encoder and decoder for multi-channel SSD controllers , 2012, 2012 IEEE International Solid-State Circuits Conference.

[179]  Srinivas Devadas,et al.  Banshee: Bandwidth-Efficient DRAM Caching via Software/Hardware Cooperation , 2017, 2017 50th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[180]  M. Wada,et al.  Stress induced leakage current limiting to scale down EEPROM tunnel oxide thickness , 1988, Technical Digest., International Electron Devices Meeting.

[181]  José F. Martínez,et al.  Understanding and mitigating refresh overheads in high-density DDR4 DRAM systems , 2013, ISCA.

[182]  Ke Zhou,et al.  FlexECC: Partially Relaxing ECC of MLC SSD for Better Cache Performance , 2014, USENIX Annual Technical Conference.

[183]  Jie Liu,et al.  SSD Failures in Datacenters: What, When and Why? , 2016, SIGMETRICS.

[184]  Wei Liu,et al.  Low-Power High-Throughput BCH Error Correction VLSI Design for Multi-Level Cell NAND Flash Memories , 2006, 2006 IEEE Workshop on Signal Processing Systems Design and Implementation.

[185]  W. W. Peterson,et al.  Cyclic Codes for Error Detection , 1961, Proceedings of the IRE.

[186]  Jinghu Chen,et al.  Near optimum universal belief propagation based decoding of low-density parity check codes , 2002, IEEE Trans. Commun..

[187]  Onur Mutlu,et al.  Distributed order scheduling and its application to multi-core dram controllers , 2008, PODC '08.

[188]  Young-Hyun Jun,et al.  A 21 nm High Performance 64 Gb MLC NAND Flash Memory With 400 MB/s Asynchronous Toggle DDR Interface , 2012, IEEE Journal of Solid-State Circuits.

[189]  Tei-Wei Kuo,et al.  Real-time garbage collection for flash-memory storage systems of real-time embedded systems , 2004, TECS.

[190]  R. Degraeve,et al.  Analytical percolation model for predicting anomalous charge loss in flash memories , 2004, IEEE Transactions on Electron Devices.

[191]  Mahmut T. Kandemir,et al.  ZombieNAND: Resurrecting Dead NAND Flash for Improved SSD Longevity , 2014, 2014 IEEE 22nd International Symposium on Modelling, Analysis & Simulation of Computer and Telecommunication Systems.

[192]  Yoongu Kim,et al.  Architectural Techniques to Enhance DRAM Scaling , 2018 .

[193]  Onur Mutlu,et al.  Phase change memory architecture and the quest for scalability , 2010, Commun. ACM.

[194]  Onur Mutlu,et al.  Data retention in MLC NAND flash memory: Characterization, optimization, and recovery , 2015, 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA).

[195]  Jie Liu,et al.  Characterizing Application Memory Error Vulnerability to Optimize Datacenter Cost via Heterogeneous-Reliability Memory , 2014, 2014 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks.

[196]  Yan Solihin,et al.  CHOP: Adaptive filter-based DRAM caching for CMP server platforms , 2010, HPCA - 16 2010 The Sixteenth International Symposium on High-Performance Computer Architecture.

[197]  Gabriel H. Loh,et al.  Fundamental Latency Trade-off in Architecting DRAM Caches: Outperforming Impractical SRAM-Tags with a Simple and Practical Design , 2012, 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture.

[198]  Y. Iwata,et al.  Bit Cost Scalable Technology with Punch and Plug Process for Ultra High Density Flash Memory , 2007, 2007 IEEE Symposium on VLSI Technology.

[199]  Qiang Wu,et al.  Revisiting Memory Errors in Large-Scale Production Data Centers: Analysis and Modeling of New Trends from the Field , 2015, 2015 45th Annual IEEE/IFIP International Conference on Dependable Systems and Networks.

[200]  R. Fowler,et al.  Electron Emission in Intense Electric Fields , 1928 .

[201]  Chris Fallin,et al.  Flipping bits in memory without accessing them: An experimental study of DRAM disturbance errors , 2014, 2014 ACM/IEEE 41st International Symposium on Computer Architecture (ISCA).

[202]  Longzhe Han,et al.  CATA: A Garbage Collection Scheme for Flash Memory File Systems , 2006, UIC.

[203]  Jihong Kim,et al.  A read-disturb management technique for high-density NAND flash memory , 2013, APSys.

[204]  Tei-Wei Kuo,et al.  Garbage collection and wear leveling for flash memory: Past and future , 2014, 2014 International Conference on Smart Computing.

[205]  Onur Mutlu,et al.  Architecting phase change memory as a scalable dram alternative , 2009, ISCA '09.

[206]  Robert H. Dennard,et al.  Challenges and future directions for the scaling of dynamic random-access memory (DRAM) , 2002, IBM J. Res. Dev..

[207]  Tong Zhang,et al.  DiffECC: Improving SSD Read Performance Using Differentiated Error Correction Coding Schemes , 2010, 2010 IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems.

[208]  Onur Mutlu,et al.  HeatWatch: Improving 3D NAND Flash Memory Device Reliability by Exploiting Self-Recovery and Temperature Awareness , 2018, 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA).

[209]  Robert G. Gallager,et al.  Low-density parity-check codes , 1962, IRE Trans. Inf. Theory.

[210]  Stefan Mangard,et al.  Rowhammer.js: A Remote Software-Induced Fault Attack in JavaScript , 2015, DIMVA.

[211]  Haralampos Pozidis,et al.  Multilevel-Cell Phase-Change Memory: A Viable Technology , 2016, IEEE Journal on Emerging and Selected Topics in Circuits and Systems.

[212]  Vidyabhushan Mohan,et al.  MODELING THE PHYSICAL CHARACTERISTICS OF NAND FLASH MEMORY , 2010 .

[213]  Onur Mutlu,et al.  SoftMC: A Flexible and Practical Open-Source Infrastructure for Enabling Experimental DRAM Studies , 2017, 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA).

[214]  J. Kessenich,et al.  Bit error rate in NAND Flash memories , 2008, 2008 IEEE International Reliability Physics Symposium.

[215]  Rachata Ausavarungnirun,et al.  Row buffer locality aware caching policies for hybrid memories , 2012, 2012 IEEE 30th International Conference on Computer Design (ICCD).

[216]  Arif Merchant,et al.  Flash Reliability in Production: The Expected and the Unexpected , 2016, FAST.

[217]  Radford M. Neal,et al.  Near Shannon limit performance of low density parity check codes , 1996 .

[218]  Li-Pin Chang,et al.  On efficient wear leveling for large-scale flash-memory storage systems , 2007, SAC '07.

[219]  Onur Mutlu,et al.  ChargeCache: Reducing DRAM latency by exploiting row access locality , 2016, 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA).

[220]  Onur Mutlu,et al.  Low-Cost Inter-Linked Subarrays (LISA): Enabling fast inter-subarray data movement in DRAM , 2016, 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA).

[221]  M. Momodomi,et al.  New ultra high density EPROM and flash EEPROM with NAND structure cell , 1987, 1987 International Electron Devices Meeting.

[222]  Piero Olivo,et al.  Flash memory cells-an overview , 1997, Proc. IEEE.

[223]  Onur Mutlu,et al.  A case for exploiting subarray-level parallelism (SALP) in DRAM , 2012, 2012 39th Annual International Symposium on Computer Architecture (ISCA).

[224]  Lizy Kurian John,et al.  ESKIMO - energy savings using semantic knowledge of inconsequential memory occupancy for DRAM subsystem , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[225]  Moinuddin K. Qureshi,et al.  XED: Exposing On-Die Error Detection Information for Strong Memory Reliability , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).

[226]  Yoon-Hee Choi,et al.  Three-Dimensional 128 Gb MLC Vertical nand Flash Memory With 24-WL Stacked Layers and 50 MB/s High-Speed Programming , 2014, IEEE Journal of Solid-State Circuits.

[227]  Chung Lam,et al.  7.3 A resistance-drift compensation scheme to reduce MLC PCM raw BER by over 100× for storage-class memory applications , 2016, 2016 IEEE International Solid-State Circuits Conference (ISSCC).

[228]  T. Hamamoto,et al.  On the retention time distribution of dynamic random access memory (DRAM) , 1998 .