Maximally Recoverable Codes for Grid-like Topologies

The explosion in the volume of data stored online has pushed distributed storage systems toward erasure-coding-based schemes. Yet the codes deployed in practice are fairly short. In this work, we address what we view as the main coding-theoretic barrier to deploying longer codes in storage: at large lengths, failures are not independent, and correlated failures are inevitable. This motivates designing codes that allow quick data recovery even after large correlated failures and that have efficient encoding and decoding. We propose that code design for distributed storage be viewed as a two-step process. The first step is to choose a topology for the code, which incorporates knowledge about the correlated failures that need to be handled and ensures local recovery from such failures. In the second step, one specifies a code with the chosen topology by choosing coefficients from a finite field. In this step, one tries to balance reliability (which is better over larger fields) with encoding and decoding efficiency (which is better over smaller fields). This work initiates an in-depth study of this reliability/efficiency tradeoff. We consider the field size needed to achieve maximal recoverability, the strongest reliability possible with a given topology. We propose a family of topologies called grid-like topologies, which unify a number of topologies considered in both theory and practice, and we prove a collection of results about maximally recoverable codes in such topologies, including the first super-polynomial lower bound on the field size.
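
To make the two-step view concrete, the following is a minimal sketch under assumptions that are not taken from the abstract: the topology is the simplest grid, an m x n array in which every row and every column must XOR to zero, and the coefficients are fixed to 1 over GF(2). The function names are illustrative. The sketch relies on a standard fact about linear codes: an erasure pattern is recoverable exactly when the parity-check columns at the erased positions are linearly independent.

```python
def gf2_rank(vectors):
    """Rank over GF(2) of vectors given as integer bitmasks."""
    pivots = {}  # leading bit position -> reduced vector with that leading bit
    for v in vectors:
        while v:
            top = v.bit_length() - 1
            if top not in pivots:
                pivots[top] = v
                break
            v ^= pivots[top]  # cancel the leading bit and keep reducing
    return len(pivots)


def erasure_column(i, j, m):
    """Parity-check column of cell (i, j): it appears in row constraint i
    and in column constraint j (stored at bit m + j). Assumed topology:
    one XOR parity per row and one per column, no global parities."""
    return (1 << i) | (1 << (m + j))


def is_recoverable(erased, m, n):
    """True if the erased cells of an m x n grid can be recovered, i.e. their
    parity-check columns are linearly independent over GF(2)."""
    for i, j in erased:
        if not (0 <= i < m and 0 <= j < n):
            raise ValueError(f"cell {(i, j)} is outside the {m} x {n} grid")
    cols = [erasure_column(i, j, m) for i, j in set(erased)]
    return gf2_rank(cols) == len(cols)


if __name__ == "__main__":
    m, n = 3, 4
    l_shape = [(0, 0), (0, 1), (1, 0)]
    square = [(0, 0), (0, 1), (1, 0), (1, 1)]
    print(is_recoverable(l_shape, m, n))
    print(is_recoverable(square, m, n))
```

In this toy topology, the L-shaped pattern is recoverable while the 2 x 2 square is not, because the four columns of the square sum to zero. Richer topologies with more parities per row or column, or with global parities, would need coefficients from larger fields, which is where the field-size question studied in this work arises.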
