An efficient XOR-scheduling algorithm for erasure codes encoding

In large storage systems, it is crucial to protect data from loss due to failures. Erasure codes lay the foundation of this protection, enabling systems to reconstruct lost data when components fail. Erasure codes can, however, impose significant performance overhead in two core operations: encoding, where coding information is calculated from newly written data, and decoding, where data is reconstructed after failures. This paper focuses on improving the performance of encoding, the more frequent operation. It does so by scheduling the operations of XOR-based erasure codes to optimize their use of cache memory. We call the technique XOR-scheduling and demonstrate how it applies to a wide variety of existing erasure codes. We conduct a performance evaluation of scheduling these codes on a variety of processors and show that XOR-scheduling significantly improves upon the traditional approach. Hence, we believe that XOR-scheduling has great potential for wide impact in practical storage systems.
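
To make the idea of reordering XOR operations concrete, the following is a minimal sketch and not the paper's algorithm, which the abstract does not spell out. It assumes a toy code in which each parity block is the XOR of a fixed set of data blocks, and it contrasts two orderings of the same XORs: one that finishes each parity block in turn, and one that visits each data block once and folds it into every parity block that needs it. The function names `encode_parity_major` and `encode_data_major` and the `equations` layout are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence


def xor_into(dst: bytearray, src: bytes) -> None:
    """XOR src into dst in place (both must have the same length)."""
    for i, b in enumerate(src):
        dst[i] ^= b


def encode_parity_major(data: Sequence[bytes],
                        equations: Sequence[Sequence[int]],
                        block_size: int) -> List[bytes]:
    """Compute each parity block completely before starting the next.

    A data block used by several parities is re-read once per parity.
    """
    parities = []
    for members in equations:
        p = bytearray(block_size)
        for d in members:
            xor_into(p, data[d])
        parities.append(bytes(p))
    return parities


def encode_data_major(data: Sequence[bytes],
                      equations: Sequence[Sequence[int]],
                      block_size: int) -> List[bytes]:
    """Visit each data block once and XOR it into every parity using it.

    Same XORs as above, in a different order (an illustrative assumption
    about what a cache-aware schedule might look like).
    """
    parities = [bytearray(block_size) for _ in equations]
    for d, block in enumerate(data):
        for p, members in enumerate(equations):
            if d in members:
                xor_into(parities[p], block)
    return [bytes(p) for p in parities]


# Toy example: three data blocks, two parity equations (hypothetical code).
data = [bytes([1, 2]), bytes([4, 8]), bytes([16, 32])]
equations = [[0, 1, 2], [0, 2]]
a = encode_parity_major(data, equations, 2)
b = encode_data_major(data, equations, 2)
```

Because XOR is associative and commutative, both orderings produce identical parity blocks; they differ only in the order in which memory is touched, which is the property a cache-aware schedule would exploit.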
