Multi-Rack Regenerating Codes for Hierarchical Distributed Storage Systems

Erasure codes provide higher reliability than replication for the same level of redundancy when storing data in distributed storage systems, but at the cost of higher repair bandwidth. Regenerating codes were recently introduced to significantly reduce the repair bandwidth; they are derived by analyzing the fundamental tradeoff between storage capacity and repair bandwidth via the information flow graph. In practice, distributed storage systems in data centers are commonly hierarchical: data are organized in racks, and cross-rack communication is more costly than in-rack communication. Hence, in this paper, we introduce a class of codes, termed multi-rack regenerating codes (MRC), that repair a failed node by downloading data only from nodes in the same rack. Unlike existing work, our codes reduce the cross-rack repair bandwidth to zero. We also obtain the optimal tradeoff between storage and repair bandwidth for MRC and present an explicit construction of MRC within the common product-matrix framework.
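To make the storage-bandwidth tradeoff mentioned above concrete, the following minimal sketch evaluates the two extreme points of the classical tradeoff for regenerating codes in a flat (non-hierarchical) system. It is not the MRC tradeoff derived in this paper. It assumes the standard cut-set bound from the regenerating-codes literature, B <= sum over i = 0..k-1 of min(alpha, (d - i) * beta), where B is the file size, alpha the per-node storage, beta the amount downloaded from each of d helper nodes, and k the number of nodes sufficient to recover the file. The function names and the example parameters (B = 12, k = 3, d = 5) are illustrative assumptions.

```python
from fractions import Fraction


def cut_set_capacity(alpha, beta, k, d):
    """Cut-set bound on the file size for a flat (n, k, d) regenerating code:
    sum_{i=0}^{k-1} min(alpha, (d - i) * beta)."""
    return sum(min(alpha, (d - i) * beta) for i in range(k))


def msr_point(B, k, d):
    """Minimum-storage regenerating point: (alpha, gamma), gamma = d * beta."""
    alpha = Fraction(B, k)
    gamma = Fraction(B * d, k * (d - k + 1))
    return alpha, gamma


def mbr_point(B, k, d):
    """Minimum-bandwidth regenerating point: per-node storage equals repair bandwidth."""
    alpha = Fraction(2 * B * d, k * (2 * d - k + 1))
    return alpha, alpha


if __name__ == "__main__":
    B, k, d = 12, 3, 5  # illustrative parameters, not taken from the paper
    for name, (alpha, gamma) in (("MSR", msr_point(B, k, d)),
                                 ("MBR", mbr_point(B, k, d))):
        beta = gamma / d
        # Both points meet the cut-set bound with equality.
        assert cut_set_capacity(alpha, beta, k, d) == B
        print(f"{name}: alpha={alpha}, gamma={gamma}")
    # Conventional erasure-code repair rebuilds the whole file, downloading B.
    print(f"conventional repair download: {B}")
```

With these parameters, both points require less repair traffic than the B = 12 units needed to rebuild a node by reconstructing the whole file, which is the gap that regenerating codes exploit.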
