STORE: Data recovery with approximate minimum network bandwidth and disk I/O in distributed storage systems

Traditional erasure codes such as Reed-Solomon (RS) codes are increasingly deployed in distributed storage systems to reduce the large storage overhead of the widely adopted replication scheme. However, these codes consume substantial network bandwidth and disk I/O when recovering missing or unavailable data. This is referred to as the recovery problem. In this paper, we focus on integrating exact minimum bandwidth regenerating codes into practical systems to solve the recovery problem. We design an implementation-friendly storage code, based on the recently proposed BASIC framework and a ZigZag decodable code, that saves recovery bandwidth and disk I/O. We build a system called STORE on top of this code and evaluate our prototype on an HDFS cluster testbed with 21 nodes. With STORE, the recovery bandwidth is approximately minimal when recovering either a data block or a parity block. The recovery disk I/O is also approximately minimal when recovering a data block. Because recovery bandwidth and disk I/O are reduced, degraded read throughput improves notably.
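To make the recovery problem concrete, the sketch below compares the data downloaded to repair one lost block under a conventional (n, k) RS code with the repair bandwidth at the minimum bandwidth regenerating (MBR) point of the standard storage-bandwidth tradeoff. The file size and the parameters k and d (the number of helper nodes contacted during repair) are illustrative assumptions and are not STORE's actual configuration; the function names are hypothetical.

```python
from fractions import Fraction


def rs_repair_bandwidth(file_size, k):
    """Conventional RS repair: download k blocks, each of size file_size / k."""
    return k * Fraction(file_size, k)


def mbr_repair_bandwidth(file_size, k, d):
    """Repair bandwidth at the MBR point: 2*M*d / (k * (2*d - k + 1)).

    d is the number of helper nodes contacted, with k <= d.
    """
    if d < k:
        raise ValueError("d must be at least k")
    return Fraction(2 * file_size * d, k * (2 * d - k + 1))


# Assumed example: a 384 MB file, k = 6, d = 8 helpers.
rs = rs_repair_bandwidth(384, 6)
mbr = mbr_repair_bandwidth(384, 6, 8)
print(f"RS repair downloads  {float(rs):.1f} MB")
print(f"MBR repair downloads {float(mbr):.1f} MB")
```

Under these assumed parameters, RS repair downloads the equivalent of the whole file, while the MBR point needs roughly a quarter of that. The trade-off is that at the MBR point each node stores as much as it downloads during repair, which is more than the file_size / k stored per node under RS.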
