Error source and latency-aware read performance optimization scheme for aged SSDs

LDPC codes are widely used in NAND flash-based storage systems because their high error-correction capability prolongs the lifetime of multi-bit NAND flash. However, LDPC decoding latency degrades SSD read performance, because decoding induces more read-retry operations. Recent work has proposed recording the last RL (read level) used for each page, which improves performance by avoiding many useless failed reads. However, these schemes reset the RL of a block's pages to 1 after the block is erased, and reading those pages at RL 1 may cause many failed reads on the first read of each page. This happens because the schemes ignore that errors come from different sources: part of a page's errors comes from P/E cycles, while the rest comes from retention time and other sources. Motivated by this observation, we propose two schemes to optimize the read procedure of NAND flash-based SSDs, especially aged SSDs. First, we record the RL induced by each error source separately, so that the RL of a page can be kept unchanged, rather than reset to 1, after its block is erased. This can reduce useless failed reads when the blocks are read for the first time. Second, we design a latency-aware I/O scheduler that reorders incoming read requests in batches, prioritizing low-latency requests to reduce queueing latency. Our experiments show that the proposed scheme can reduce the average response time by up to 33% with less storage overhead.

key words: NAND flash, read retries, LDPC, retention time, I/O scheduler
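The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the per-source levels are stored or combined, so the names `Block`, `pe_rl`, `retention_extra`, `read_page`, `schedule`, and the constant `MAX_RL` are all hypothetical. The sketch assumes that the starting RL is the P/E-induced level plus a per-page retention offset, that only the retention offset is cleared on erase, and that predicted read latency grows with the starting RL.

```python
from dataclasses import dataclass, field

MAX_RL = 7  # hypothetical number of read levels supported by the device


@dataclass
class Block:
    # RL attributed to P/E wear. Assumption: it survives an erase.
    pe_rl: int = 1
    # Extra levels attributed to retention, per page. Cleared on erase.
    retention_extra: dict = field(default_factory=dict)

    def start_rl(self, page):
        """RL at which the next read of `page` begins."""
        return min(MAX_RL, self.pe_rl + self.retention_extra.get(page, 0))

    def record_success(self, page, rl, fresh):
        """Store the level that decoded `page`, split by assumed error source."""
        if fresh:
            # Freshly programmed data: attribute the whole level to P/E wear.
            self.pe_rl = max(self.pe_rl, rl)
        else:
            # Older data: attribute anything above the wear level to retention.
            self.retention_extra[page] = max(0, rl - self.pe_rl)

    def erase(self):
        # Retention errors disappear with the data; wear does not.
        self.retention_extra.clear()


def read_page(block, page, required_rl, fresh=False):
    """Return the number of sensing attempts for one read.

    `required_rl` stands in for the device: the lowest level that decodes.
    """
    rl = block.start_rl(page)
    attempts = 1
    while rl < required_rl:
        rl += 1          # failed read, retry one level higher
        attempts += 1
    block.record_success(page, rl, fresh)
    return attempts


def schedule(batch, blocks):
    """Order a batch of (block_id, page) reads, lowest predicted RL first."""
    return sorted(batch, key=lambda req: blocks[req[0]].start_rl(req[1]))
```

For example, if a worn block needs RL 3 even for fresh data, the first read after programming takes three attempts and sets `pe_rl` to 3. After `erase()`, the next fresh read starts at RL 3 and succeeds in one attempt, whereas a scheme that resets the level to 1 would repeat the two failed reads. `schedule` is a shortest-predicted-latency-first ordering within one batch; because `sorted` is stable, requests with equal predicted levels keep their arrival order.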
