Information-Theoretically Secure Erasure Codes for Distributed Storage

Repair operations in erasure-coded distributed storage systems move large amounts of data across the network. This movement can expose data to passive eavesdroppers or active adversaries, putting the security of the system at risk. This paper presents coding schemes and repair algorithms that keep the data secure in the presence of passive eavesdroppers and active adversaries while maintaining high availability, reliability, and resource efficiency. The proposed codes are optimal in that they meet previously proposed lower bounds on storage and network-bandwidth requirements for a wide range of system parameters. The results thus establish the secure storage capacity of such systems. The proposed codes are based on an optimal class of codes called product-matrix codes. The constructions for security against active adversaries offer the additional feature of “on-demand security,” where the desired level of security can be chosen separately for each instance of repair, and the proposed algorithms remain optimal simultaneously for all possible security levels. This paper also provides necessary and sufficient conditions governing the transformation of any (non-secure) code into one providing on-demand security.
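
To make the notion of "meeting a bound on storage and network bandwidth" concrete, the sketch below evaluates one bound of this kind as it is commonly written in the regenerating-codes literature: with n nodes, any k of which suffice to recover the data, d helper nodes per repair, per-node storage alpha, per-helper repair download beta, and an eavesdropper that reads ell nodes, the number of message symbols that can be stored securely is at most the sum over i = ell+1, ..., k of min(alpha, (d - i + 1) * beta). The abstract does not state this formula, so the formula, the function name, and all parameter names here are assumptions chosen for illustration; this is not the paper's construction.

```python
def secure_file_size_bound(k, d, alpha, beta, ell):
    """Upper bound on securely storable message symbols (illustrative).

    Assumed model (not taken from the abstract): any k nodes recover the
    data, each repair contacts d helpers and downloads beta symbols from
    each, every node stores alpha symbols, and an eavesdropper observes
    the contents of ell nodes.
    """
    if not (0 <= ell <= k <= d):
        raise ValueError("expected 0 <= ell <= k <= d")
    if alpha < 0 or beta < 0:
        raise ValueError("alpha and beta must be non-negative")
    # Each compromised node removes the leading (largest) term of the sum.
    return sum(min(alpha, (d - i + 1) * beta) for i in range(ell + 1, k + 1))


# Minimum-bandwidth style parameters: alpha = d * beta.
mbr_no_eavesdropper = secure_file_size_bound(k=3, d=4, alpha=4, beta=1, ell=0)
mbr_one_eavesdropper = secure_file_size_bound(k=3, d=4, alpha=4, beta=1, ell=1)

# Minimum-storage style parameters: alpha = (d - k + 1) * beta.
msr_no_eavesdropper = secure_file_size_bound(k=3, d=4, alpha=2, beta=1, ell=0)
msr_one_eavesdropper = secure_file_size_bound(k=3, d=4, alpha=2, beta=1, ell=1)
```

With ell = 0 the expression reduces to the familiar non-secure file-size bound, so the difference between the ell = 0 and ell > 0 values can be read as the storage given up for secrecy under these assumptions.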
