A new Zigzag MDS code with optimal encoding and efficient decoding

Distributed file systems have emerged in recent years as an efficient way to store the large volumes of data produced anytime and anywhere. To guarantee data reliability, storage systems must introduce redundancy. Practical systems are increasingly adopting erasure codes instead of simple replication because of their better storage efficiency. However, traditional erasure codes, such as maximum-distance-separable (MDS) codes, are designed over a large finite field, which hinders their wide deployment. In this paper, we propose a new family of MDS codes with high computational efficiency. More specifically, the encoding process uses only XOR operations to generate the parity blocks. When a storage node fails, we use the efficient Zigzag decoding method to recover the lost blocks. The proposed codes thus achieve optimal encoding and efficient decoding. Furthermore, we implement the proposed codes in a distributed file system, and the results show the high performance of the new codes.
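To make the idea of XOR-only encoding with zigzag decoding concrete, the following is a minimal sketch of the generic shift-and-XOR technique on two data blocks. It is an illustrative toy under stated assumptions, not the construction proposed in this paper: the function names `encode` and `zigzag_decode`, the choice of two data blocks, and the one-symbol shift are all hypothetical.

```python
def encode(a, b):
    """Build two parity blocks from data blocks a and b using XOR only.

    Hypothetical toy layout: p1 = a XOR b, and p2 = a XOR (b delayed by
    one symbol), so p2 is one symbol longer than the data blocks.
    """
    assert len(a) == len(b)
    n = len(a)
    p1 = [a[i] ^ b[i] for i in range(n)]
    a_pad = a + [0]      # pad a at the end to match the shifted length
    b_shift = [0] + b    # delay b by one symbol
    p2 = [a_pad[i] ^ b_shift[i] for i in range(n + 1)]
    return p1, p2


def zigzag_decode(p1, p2):
    """Recover a and b from the two parity blocks alone.

    p2[0] exposes a[0] directly; each recovered symbol then unlocks the
    next one in the other parity block, alternating in a zigzag.
    """
    n = len(p1)
    a, b = [], []
    prev_b = 0  # b[i-1]; zero before the first symbol
    for i in range(n):
        a_i = p2[i] ^ prev_b   # p2[i] = a[i] XOR b[i-1]
        b_i = p1[i] ^ a_i      # p1[i] = a[i] XOR b[i]
        a.append(a_i)
        b.append(b_i)
        prev_b = b_i
    return a, b


if __name__ == "__main__":
    a = [1, 0, 1, 1]
    b = [0, 1, 1, 0]
    p1, p2 = encode(a, b)
    # Simulate losing both data blocks and rebuilding them from parity.
    assert zigzag_decode(p1, p2) == (a, b)
```

In this toy, both encoding and decoding use only XOR, and decoding needs no finite-field arithmetic. The shift makes `p2` one symbol longer than the data blocks, an artifact of this particular sketch.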
