Parallelism-Aware Locally Repairable Code for Distributed Storage Systems

Distributed storage systems store a substantial amount of data on a large number of servers built with commodity hardware. To protect data against server failures, erasure coding has been deployed in many distributed storage systems because of its low storage overhead. In particular, since disk I/O is in many cases a bottleneck in a distributed storage system, locally repairable codes have been proposed, which incur low volumes of disk I/O when reconstructing missing data after server failures. However, since original data can only be read from specific servers, existing designs of locally repairable codes suffer from limited data parallelism. Moreover, if the performance of servers is heterogeneous, slow servers may become the bottleneck when data is accessed in parallel. In this paper, we propose Galloper codes, a novel family of locally repairable codes that achieve low disk I/O during reconstruction while extending data parallelism from specific servers to all servers. In addition, the amount of original data on each server can be set arbitrarily based on the performance of that server. We have implemented a prototype of Galloper codes on Apache Hadoop, and our experimental results show that Galloper codes can reduce the completion time of MapReduce jobs by up to 42.9%, while performing comparably to existing locally repairable codes in terms of disk I/O overhead as well as encoding and reconstruction overhead.
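
To make the disk I/O argument concrete, the following minimal sketch shows why local parity reduces the number of blocks read during repair. It is a generic XOR-based illustration of local repair, not the Galloper construction: the block contents, the grouping, and the helper names `xor_blocks` and `repair` are assumptions made for this example, and the global parity blocks that practical locally repairable codes also store are omitted.

```python
from functools import reduce

def xor_blocks(blocks):
    """Bytewise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

# Four hypothetical data blocks, split into two local groups of two blocks each.
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
groups = [[0, 1], [2, 3]]

# One local parity block per group (XOR of the group's data blocks).
local_parity = [xor_blocks([data[i] for i in g]) for g in groups]

def repair(lost, surviving):
    """Rebuild block `lost` using only its local group.

    `surviving` maps block index -> block contents for the blocks still available.
    Returns the rebuilt block and the number of blocks read.
    """
    gid = next(k for k, g in enumerate(groups) if lost in g)
    helpers = [surviving[i] for i in groups[gid] if i != lost]
    helpers.append(local_parity[gid])
    return xor_blocks(helpers), len(helpers)

# Simulate losing block 2 and repairing it from its local group.
surviving = {i: b for i, b in enumerate(data) if i != 2}
rebuilt, blocks_read = repair(2, surviving)
```

In this toy layout, repairing one lost block reads two blocks (the other member of its group and the local parity) rather than all remaining data blocks. Note that the original data still resides only on the data-block servers, which is the parallelism limitation the abstract describes.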
