On the I/O Costs of Some Repair Schemes for Full-Length Reed-Solomon Codes

Network transfers and disk reads are the most time-consuming operations in the repair process for node failures in erasure-code-based distributed storage systems. Recent developments on Reed-Solomon codes, the most widely used erasure codes in practical storage systems, have shown that efficient repair schemes specifically tailored to these codes can significantly reduce the network bandwidth spent to recover single failures. However, the I/O cost, that is, the number of disk reads performed in these repair schemes, remains largely unknown. We take the first step to address this gap in the literature by investigating the I/O costs of some existing repair schemes for full-length Reed-Solomon codes.
