Optimal systematic distributed storage codes with fast encoding

We consider the problem of constructing explicit erasure codes for distributed storage with the following desirable properties, motivated by system constraints: (i) Maximum-Distance-Separable (MDS), (ii) optimal repair bandwidth, (iii) flexibility in repair (as will be described), (iv) systematic form, and (v) fast encoding (enabled by a sparse generator matrix). Existing constructions in the literature satisfy only strict subsets of these properties. This paper presents the first explicit code construction that theoretically guarantees all five properties simultaneously. We first present a construction that builds on Product-Matrix (PM) codes by enabling sparsity in their generator matrices. We then present a transformation that enables fast encoding through sparsity for general classes of storage- and repair-optimal codes. In practice, the resulting generator matrices are roughly 7 times sparser than those of their standard counterparts, yielding an encoding speedup by a factor of about 4 for typical parameters.
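
To make the link between properties (iv) and (v) concrete, the following is a minimal sketch of systematic encoding with a generator matrix G = [I | P], where the cost of encoding is the number of nonzero entries in P. Everything specific here is an assumption for illustration: the field GF(257), the helper name `encode_systematic`, and both toy matrices are hypothetical and are not taken from the paper. In particular, the sparse toy matrix is not MDS or repair-optimal; it only shows how the operation count scales with sparsity.

```python
# Illustrative sketch only: arithmetic is over the prime field GF(257),
# a simplifying assumption; the matrices are made up, not from the paper.
P_MOD = 257


def encode_systematic(message, parity_columns):
    """Return (codeword, multiplications) for a generator matrix G = [I | P].

    parity_columns[j] lists the nonzero entries of column j of P
    as (row_index, coefficient) pairs.
    """
    codeword = list(message)  # systematic part: message symbols appear uncoded
    mults = 0
    for column in parity_columns:
        acc = 0
        for row, coeff in column:
            acc = (acc + coeff * message[row]) % P_MOD
            mults += 1  # one field multiplication per nonzero entry of P
        codeword.append(acc)
    return codeword, mults


message = [3, 1, 4, 1, 5, 9]

# Dense parity part: every message symbol feeds every parity symbol.
dense = [[(i, (i + 1) * (j + 2) % P_MOD) for i in range(6)] for j in range(3)]
# Sparse parity part: each parity symbol depends on only two message symbols.
sparse = [[(0, 2), (3, 5)], [(1, 7), (4, 11)], [(2, 13), (5, 17)]]

dense_cw, dense_mults = encode_systematic(message, dense)
sparse_cw, sparse_mults = encode_systematic(message, sparse)
print(dense_mults, sparse_mults)  # 18 6
```

In this toy setting the dense matrix needs 18 multiplications and the sparse one needs 6. The measured speedup in a real system would also depend on field arithmetic and memory access, which this sketch does not model.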
