Joint decoding of RAID-ECC solutions for SSDs

NAND-based solid-state drives (SSDs) are protected against severe noise and disturb effects by two independent layers of error correction: error-correcting codes (ECCs) and redundant arrays of independent disks (RAID). However, the bit error rates of modern SSDs have risen to the point where only more powerful ECCs or maximum distance separable (MDS) codes are regarded as sufficient solutions. This work shows that, instead of designing new error correction mechanisms, the reliability of NAND-based SSDs can be improved by a joint decoder that exploits the inherent information between the ECC and RAID layers. We first show that two-page failures can be recovered with RAID 4/5 using this information; we then show that these ideas extend to RAID 6; finally, we present an information-theoretic study of how much information can be reliably stored in a NAND-based SSD with the help of a RAID system.
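
As background for the RAID 4/5 claim, the following is a minimal sketch of conventional single-parity (XOR) striping, not the joint decoder proposed in this work. It illustrates why parity alone recovers one lost page per stripe, and what remains known when two pages are lost: the XOR of the two missing pages. How the joint decoder combines that constraint with ECC information is not specified in this abstract. All function names and the toy two-byte pages are hypothetical.

```python
from functools import reduce


def xor_pages(pages):
    """Bytewise XOR of equal-length pages (hypothetical helper)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*pages))


def make_stripe(data_pages):
    """Append a single XOR parity page, as in RAID 4/5."""
    return list(data_pages) + [xor_pages(data_pages)]


def recover_single(stripe, lost):
    """Rebuild one lost page as the XOR of all surviving pages."""
    survivors = [page for i, page in enumerate(stripe) if i != lost]
    return xor_pages(survivors)


def xor_of_two_lost(stripe, lost_a, lost_b):
    """With two pages lost, parity alone yields only their XOR."""
    survivors = [page for i, page in enumerate(stripe)
                 if i not in (lost_a, lost_b)]
    return xor_pages(survivors)


# Toy stripe: three data pages of two bytes each (illustrative values).
data = [b"\x0f\xa0", b"\x33\x55", b"\xf0\x0c"]
stripe = make_stripe(data)

# One lost page is fully recoverable from the rest of the stripe.
assert recover_single(stripe, 1) == data[1]

# Two lost pages: the survivors determine their XOR, not each page.
assert xor_of_two_lost(stripe, 0, 2) == xor_pages([data[0], data[2]])
```

In this sketch, a second failure leaves one linear constraint tying the two missing pages together; any further recovery would need extra information, such as the noisy ECC codewords read from those pages.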
