On the minimum storage overhead of distributed storage codes with a given repair locality

The repair locality of a storage code is the maximum number of nodes that may be contacted during the repair of a failed node. Small repair locality is desirable because the number of disk accesses required during a node repair, which for certain applications seems to be the main bottleneck, is proportional to the repair locality. However, recent publications show that small repair locality comes with a penalty in terms of code distance or storage overhead, at least if exact repair is required. Here, we first review some of the recent work on possible (information-theoretical) trade-offs between repair locality and other code parameters such as storage overhead (or, equivalently, coding rate) and code distance, all of which assumes the exact repair regime. Then, we present some new information-theoretical lower bounds on the storage overhead as a function of the repair locality, valid for most common coding and repair models.
