LT Codes For Efficient and Reliable Distributed Storage Systems Revisited

LT codes and digital fountain techniques have received significant attention from both academics and industry in the past few years. There have also been extensive interests in applying LT code techniques to distributed storage systems such as cloud data storage in recent years. However, Plank and Thomason's experimental results show that LDPC code performs well only asymptotically when the number of data fragments increases and it has the worst performance for small number of data fragments (e.g., less than 100). In their INFOCOM 2012 paper, Cao, Yu, Yang, Lou, and Hou proposed to use exhaustive search approach to find a deterministic LT code that could be used to decode the original data content correctly in distributed storage systems. However, by Plank and Thomason's experimental results, it is not clear whether the exhaustive search approach will work efficiently or even correctly. This paper carries out the theoretical analysis on the feasibility and performance issues for applying LT codes to distributed storage systems. By employing the underlying ideas of efficient Belief Propagation (BP) decoding process in LT codes, this paper introduces two classes of codes called flat BP-XOR codes and array BP-XOR codes (which can be considered as a deterministic version of LT codes). We will show the equivalence between the edge-colored graph model and degree-one-and-two encoding symbols based array BP-XOR codes. Using this equivalence result, we are able to design general array BP-XOR codes using graph based results. Similarly, based on this equivalence result, we are able to get new results for edge-colored graph models using results from array BP-XOR codes.

[1]  Robert G. Gallager,et al.  Low-density parity-check codes , 1962, IRE Trans. Inf. Theory.

[2]  Tom Verhoeff,et al.  An updated table of minimum-distance bounds for binary linear codes , 1987, IEEE Trans. Inf. Theory.

[3]  Zhenyu Yang,et al.  LT codes-based secure and reliable cloud storage service , 2012, 2012 Proceedings IEEE INFOCOM.

[4]  Peter F. Corbett,et al.  Row-Diagonal Parity for Double Disk Failure Correction (Awarded Best Paper!) , 2004, USENIX Conference on File and Storage Technologies.

[5]  Jehoshua Bruck,et al.  Low density MDS codes and factors of complete graphs , 1998, Proceedings. 1998 IEEE International Symposium on Information Theory (Cat. No.98CH36252).

[6]  Mario Blaum,et al.  New array codes for multiple phased burst correction , 1993, IEEE Trans. Inf. Theory.

[7]  Alexander Vardy,et al.  MDS array codes with independent parity symbols , 1995, Proceedings of 1995 IEEE International Symposium on Information Theory.

[8]  Yunnan Wu,et al.  A Survey on Network Codes for Distributed Storage , 2010, Proceedings of the IEEE.

[9]  Muriel Medard,et al.  How good is random linear coding based distributed networked storage , 2005 .

[10]  Yongge Wang,et al.  Edge-colored graphs with applications to homogeneous faults , 2011, Inf. Process. Lett..

[11]  Midori Kobayashi On perfect one-factorization of the complete graphK2p , 1989, Graphs Comb..

[12]  P. A. Wintz,et al.  Error Free Coding , 1973 .

[13]  Xiaozhou Li,et al.  Flat XOR-based erasure codes in storage systems: Constructions, efficient recovery, and tradeoffs , 2010, 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST).

[14]  Michael Luby,et al.  LT codes , 2002, The 43rd Annual IEEE Symposium on Foundations of Computer Science, 2002. Proceedings..

[15]  James S. Plank,et al.  A practical analysis of low-density parity-check erasure codes for wide-area storage applications , 2004, International Conference on Dependable Systems and Networks, 2004.

[16]  John Kubiatowicz,et al.  Erasure Coding Vs. Replication: A Quantitative Comparison , 2002, IPTPS.

[17]  Cheng Huang,et al.  STAR : An Efficient Coding Scheme for Correcting Triple Storage Node Failures , 2005, IEEE Transactions on Computers.

[18]  Mario Blaum,et al.  On Lowest Density MDS Codes , 1999, IEEE Trans. Inf. Theory.

[19]  Alexandros G. Dimakis,et al.  Network Coding for Distributed Storage Systems , 2007, IEEE INFOCOM 2007 - 26th IEEE International Conference on Computer Communications.

[20]  Vinod M. Prabhakaran,et al.  Decentralized erasure codes for distributed networked storage , 2006, IEEE Transactions on Information Theory.

[21]  Daniel A. Spielman,et al.  Practical loss-resilient codes , 1997, STOC '97.

[22]  Jehoshua Bruck,et al.  Cyclic Lowest Density MDS Array Codes , 2009, IEEE Transactions on Information Theory.

[23]  Hugo Krawczyk Distributed fingerprints and secure information dispersal , 1993, PODC '93.

[24]  Jehoshua Bruck,et al.  EVENODD: An Efficient Scheme for Tolerating Double Disk Failures in RAID Architectures , 1995, IEEE Trans. Computers.

[25]  Jehoshua Bruck,et al.  X-Code: MDS Array Codes with Optimal Encoding , 1999, IEEE Trans. Inf. Theory.

[26]  Michael O. Rabin,et al.  Efficient dispersal of information for security, load balancing, and fault tolerance , 1989, JACM.

[27]  Judea Pearl,et al.  Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.

[28]  B. A. Anderson Symmetry groups of some perfect 1-factorizations of complete graphs , 1977, Discret. Math..