Storage Codes With Flexible Number of Nodes

This paper presents flexible storage codes, a class of error-correcting codes that can recover information from a flexible number of storage nodes. As a result, one can make better use of the available storage nodes in the presence of unpredictable node failures and reduce the data access latency. Assume a storage system encodes kl information symbols over a finite field F into n nodes, each of size l symbols. The code is parameterized by a set of tuples {(R_j, k_j, l_j) : 1 ≤ j ≤ a}, satisfying k_1 l_1 = k_2 l_2 = ... = k_a l_a and k_1 > k_2 > ... > k_a = k, l_a = l, such that the information symbols can be reconstructed from any R_j nodes by accessing l_j symbols in each of them. In other words, the code allows a flexible number of nodes for decoding to accommodate the variance in the data access time of the nodes. Code constructions are presented for different storage scenarios, including LRC (locally recoverable) codes, PMDS (partial MDS) codes, and MSR (minimum storage regenerating) codes. We analyze the latency of accessing information and perform simulations on Amazon clusters to show the efficiency of the presented codes.
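The parameter constraints stated above can be checked mechanically. The following is a minimal sketch, not taken from the paper: the function name `is_valid_flexible_params` and the sample values are hypothetical, and the sketch tests only the conditions given in the abstract (equal products k_j l_j = kl, strictly decreasing k_j, and k_a = k, l_a = l). It places no condition on R_j, since the abstract states none.

```python
from typing import List, Tuple


def is_valid_flexible_params(
    params: List[Tuple[int, int, int]], k: int, l: int
) -> bool:
    """Check the abstract's constraints on {(R_j, k_j, l_j) : 1 <= j <= a}.

    Hypothetical helper for illustration; R_j is carried along but not
    constrained, because the abstract gives no condition on it.
    """
    if not params:
        return False
    ks = [k_j for _, k_j, _ in params]
    ls = [l_j for _, _, l_j in params]
    # k_1 l_1 = k_2 l_2 = ... = k_a l_a (= k l, since k_a = k and l_a = l)
    same_product = all(k_j * l_j == k * l for k_j, l_j in zip(ks, ls))
    # k_1 > k_2 > ... > k_a
    strictly_decreasing = all(x > y for x, y in zip(ks, ks[1:]))
    # k_a = k and l_a = l
    ends_at_k_l = ks[-1] == k and ls[-1] == l
    return same_product and strictly_decreasing and ends_at_k_l


# Illustrative values only: k = 2, l = 3, so k * l = 6 information symbols.
# Read 2 symbols from each of R_1 nodes (k_1 = 3), or 3 symbols from each of
# R_2 nodes (k_2 = 2). R_j = k_j is an assumption made for this example.
example = [(3, 3, 2), (2, 2, 3)]
print(is_valid_flexible_params(example, k=2, l=3))
```

In this example, either choice of j covers the same 6 information symbols: more nodes with fewer symbols read per node, or fewer nodes with more symbols read per node.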
