Rack-Aware Regenerating Codes with Fewer Helper Racks

We consider the rack-aware storage system where n nodes are organized in n̄ racks, each containing u nodes, and any k nodes can retrieve the stored file. Moreover, any single-node erasure can be recovered by downloading data from d̄ helper racks as well as from the remaining u−1 nodes in the same rack. Previous work mostly focuses on minimizing the cross-rack repair bandwidth under the condition d̄ ≥ k̄, where k̄ = ⌊k/u⌋. However, d̄ ≥ k̄ is not an intrinsic condition of the rack-aware storage model. In this paper, we establish a tradeoff between the storage overhead and the cross-rack repair bandwidth for the particularly interesting case d̄ < k̄. Furthermore, we present explicit constructions of codes whose parameters lie on the tradeoff curve at the minimum-storage point and at the minimum-bandwidth point, respectively. The codes are scalar or have sub-packetization d̄, and operate over finite fields of size comparable to n. Regarding d̄ as the repair degree, these codes combine the advantage of regenerating codes in minimizing the repair bandwidth with that of locally repairable codes in reducing the repair degree. Moreover, they avoid two known restrictions: the storage overhead of at least 2× incurred by MBR codes, and the exponential sub-packetization level of high-rate MSR codes.
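As a small illustration of the parameters above, the following sketch computes n̄ = n/u and k̄ = ⌊k/u⌋ and checks whether a parameter set falls in the regime d̄ < k̄. The class name `RackParams`, its field names, and the numeric values are hypothetical and chosen only for illustration; the sketch assumes n is a multiple of u and that helper racks are drawn from the other n̄−1 racks. It does not implement any code construction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RackParams:
    """Hypothetical container for the rack-aware model parameters."""
    n: int      # total number of nodes
    u: int      # nodes per rack
    k: int      # any k nodes can retrieve the file
    d_bar: int  # number of helper racks contacted during a repair

    def __post_init__(self):
        if self.u <= 0 or self.n <= 0 or self.n % self.u != 0:
            raise ValueError("assumed here: n is a positive multiple of u")
        if not 0 < self.k <= self.n:
            raise ValueError("k must satisfy 0 < k <= n")
        # Helper racks are racks other than the one holding the failed node.
        if not 0 < self.d_bar <= self.n_bar - 1:
            raise ValueError("d_bar must satisfy 0 < d_bar <= n_bar - 1")

    @property
    def n_bar(self) -> int:
        """Number of racks, n / u."""
        return self.n // self.u

    @property
    def k_bar(self) -> int:
        """k_bar = floor(k / u)."""
        return self.k // self.u

    @property
    def fewer_helper_racks(self) -> bool:
        """True when d_bar < k_bar, the case studied in the abstract."""
        return self.d_bar < self.k_bar


# Illustrative values only: 12 nodes in 4 racks of 3, k = 8.
p = RackParams(n=12, u=3, k=8, d_bar=1)
print(p.n_bar, p.k_bar, p.fewer_helper_racks)  # 4 2 True

# With d_bar = 3 the same system falls in the previously studied case d_bar >= k_bar.
q = RackParams(n=12, u=3, k=8, d_bar=3)
print(q.fewer_helper_racks)  # False
```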
