Efficient Reliability in Volunteer Storage Systems with Random Linear Coding

Volunteer systems pose difficult challenges for data storage. Because volunteer nodes are extremely unreliable, these systems require such high redundancy that replication is infeasible. Erasure coding has been proposed to cope with this problem, as it needs much less redundancy to achieve the same reliability. Its downside is that repairing the system creates high overhead, because the original data must be fully decoded before new coded data can be generated. Random linear coding has been proposed as a data storage method, as it provides a better redundancy/reliability ratio and lower control overhead. We propose that it also helps in repairing the system, because decoding is not required; instead, new coded data can be generated from already existing coded data. However, this iterative repair may degrade the data over time, particularly if sparse coding is used to improve computational efficiency. This paper examines the effects of random linear coding and of iterative repair. It shows the reliability that can be achieved with random linear coding in a highly volatile distributed system. We conclude that random linear coding can achieve high reliability even in highly volatile systems.
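The repair idea described above, generating new coded data from existing coded data without decoding, can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes coding over GF(2) (plain XOR), data blocks represented as Python integers, and hypothetical function names (`encode`, `recode`, `rank_gf2`, `decode`). A practical system would more likely use a larger field and byte buffers.

```python
import random

def encode(blocks, rng):
    """Return one coded block: (coefficient bitmask, XOR of the selected source blocks)."""
    coeffs = 0
    while coeffs == 0:                      # an all-zero combination carries no information
        coeffs = rng.getrandbits(len(blocks))
    payload = 0
    for i, block in enumerate(blocks):
        if coeffs >> i & 1:
            payload ^= block
    return coeffs, payload

def recode(coded, rng):
    """Build a new coded block from existing coded blocks, without decoding."""
    if not coded:
        raise ValueError("need at least one coded block")
    while True:
        coeffs, payload = 0, 0
        for c, p in coded:
            if rng.getrandbits(1):          # include this block with probability 1/2
                coeffs ^= c                 # coefficients combine linearly ...
                payload ^= p                # ... exactly like the payloads
        if coeffs:
            return coeffs, payload

def rank_gf2(vectors):
    """Rank of bitmask vectors over GF(2); the data is decodable iff rank == k."""
    pivots = {}                             # highest set bit -> reduced vector
    for v in vectors:
        while v:
            h = v.bit_length() - 1
            if h not in pivots:
                pivots[h] = v
                break
            v ^= pivots[h]
    return len(pivots)

def decode(coded, k):
    """Recover the k source blocks by Gaussian elimination, or None if rank < k."""
    pivots = {}                             # highest set bit -> (coeffs, payload)
    for c, p in coded:
        while c:
            h = c.bit_length() - 1
            if h not in pivots:
                pivots[h] = (c, p)
                break
            pc, pp = pivots[h]
            c, p = c ^ pc, p ^ pp
    if len(pivots) < k:
        return None
    out = [0] * k
    for h in range(k):                      # solve from the lowest pivot upward
        c, p = pivots[h]
        for i in range(h):
            if c >> i & 1:
                p ^= out[i]
        out[h] = p
    return out

if __name__ == "__main__":
    rng = random.Random(0)
    k, n = 8, 16                            # illustrative sizes, not from the paper
    blocks = [rng.getrandbits(64) for _ in range(k)]
    store = [encode(blocks, rng) for _ in range(n)]
    for _ in range(100):                    # a node leaves; repair from the survivors
        store.pop(rng.randrange(len(store)))
        store.append(recode(store, rng))
    print("rank after repairs:", rank_gf2([c for c, _ in store]), "of", k)
```

Because every repaired block lies in the span of the surviving blocks, the rank of the stored coefficient vectors can stay the same or drop after a repair, but never grow. This is one way to view the degradation concern raised above: once the rank falls below k, the original data can no longer be decoded.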

[1]  David P. Anderson,et al.  BOINC: a system for public-resource computing and storage , 2004, Fifth IEEE/ACM International Workshop on Grid Computing.

[2]  Gilles Fedak,et al.  The Computational and Storage Potential of Volunteer Computing , 2006, Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGRID'06).

[3]  Tracey Ho,et al.  A Random Linear Network Coding Approach to Multicast , 2006, IEEE Transactions on Information Theory.

[4]  Rudolf Ahlswede,et al.  Network information flow , 2000, IEEE Trans. Inf. Theory.

[5]  Shuo-Yen Robert Li,et al.  Linear network coding , 2003, IEEE Trans. Inf. Theory.

[6]  Yunnan Wu,et al.  A Survey on Network Codes for Distributed Storage , 2010, Proceedings of the IEEE.

[7]  Alexandros G. Dimakis,et al.  Network Coding for Distributed Storage Systems , 2007, IEEE INFOCOM 2007 - 26th IEEE International Conference on Computer Communications.

[8]  Baochun Li,et al.  How Practical is Network Coding? , 2006, 200614th IEEE International Workshop on Quality of Service.

[9]  Joong Bum Rhim,et al.  Fountain Codes , 2010 .

[10]  Yinlong Xu,et al.  A Content Distribution System based on Sparse Linear Network Coding , 2006 .

[11]  John Kubiatowicz,et al.  Erasure Coding Vs. Replication: A Quantitative Comparison , 2002, IPTPS.

[12]  Sanjay Ghemawat,et al.  MapReduce: Simplified Data Processing on Large Clusters , 2004, OSDI.

[13]  Rodrigo Rodrigues,et al.  High Availability in DHTs: Erasure Coding vs. Replication , 2005, IPTPS.

[14]  Péter Kacsuk,et al.  Efficient Random Network Coding for Distributed Storage Systems , 2013, Euro-Par Workshops.

[15]  Jörg Widmer,et al.  Network coding: an instant primer , 2006, CCRV.

[16]  Muriel Medard,et al.  How good is random linear coding based distributed networked storage , 2005 .