Erasure-Coding-Based Storage and Recovery for Distributed Exascale Storage Systems

Distributed file systems use various techniques to ensure data availability and stability. Data have typically been stored using replication, but because of its poor space efficiency, erasure coding (EC) has been adopted more recently. EC is more space-efficient than replication. However, EC has several sources of performance degradation, such as encoding and decoding overhead and input/output (I/O) degradation. Thus, this study proposes a buffering and combining technique in which the multiple I/O requests that occur during encoding in an EC-based distributed file system are combined into one and processed together. In addition, it proposes four recovery measures (disk I/O load distribution, random block layout, multi-thread-based parallel recovery, and matrix recycling) to distribute the disk I/O loads generated during decoding.
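The space-efficiency argument can be made concrete with a short calculation. The sketch below is illustrative only: the function names and the parameters (3 copies, k = 4 data blocks, m = 2 parity blocks) are assumptions chosen for the example, not values taken from the study, and the fault-tolerance figure for EC assumes a maximum-distance-separable code such as Reed-Solomon.

```python
def replication_overhead(copies: int) -> float:
    """Raw bytes stored per byte of user data when keeping `copies` full copies."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return float(copies)


def ec_overhead(k: int, m: int) -> float:
    """Raw bytes stored per byte of user data with k data and m parity blocks."""
    if k < 1 or m < 0:
        raise ValueError("need k >= 1 and m >= 0")
    return (k + m) / k


def tolerated_losses_replication(copies: int) -> int:
    """Data survives as long as one copy remains."""
    return copies - 1


def tolerated_losses_ec(m: int) -> int:
    """For an MDS code (assumed here), any m of the k + m blocks may be lost."""
    return m


if __name__ == "__main__":
    # Hypothetical configurations used only for illustration.
    print("3-way replication overhead:", replication_overhead(3))
    print("EC(4+2) overhead:", ec_overhead(4, 2))
    print("losses tolerated:", tolerated_losses_replication(3), "vs", tolerated_losses_ec(2))
```

Under these assumed parameters both schemes tolerate two lost blocks, while EC stores half as many raw bytes. The cost is the encoding, decoding, and I/O overhead that the abstract identifies.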
