Peer-to-Peer Storage Systems: A Practical Guideline to be Lazy

Distributed and peer-to-peer storage systems are foreseen as an alternative to traditional data centers and in-house backup solutions. In the past few years, many peer-to-peer storage systems have been proposed. Most of them rely on erasure codes to introduce redundancy into the data. Such systems depend on many parameters that need to be well tuned, such as the redundancy factor, the frequency of data repair, and the size of a data block. In this paper we give closed-form mathematical expressions that estimate the system's average behavior. These expressions are derived from a Markov chain. Our contribution is a guideline that helps system designers and administrators choose the best set of parameters, that is, tune the system to obtain a desired level of reliability under a given bandwidth constraint. We confirm that a lazy repair strategy can be employed to amortize the repair cost. Moreover, we propose a formula to calculate the optimal threshold value that minimizes the bandwidth consumption. Finally, we discuss the impact of different system characteristics on the performance metrics, such as the number of peers, the amount of stored data, and the disk failure rate. To the best of our knowledge, this is the first work to give closed-form formulas that estimate the bandwidth consumption under lazy repair and the loss rate while taking the repair time into account.
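To make the lazy repair idea concrete, the following is a minimal Python sketch of a single erasure-coded block. It is a toy simulation, not the Markov chain model or the closed-form expressions of the paper. All names (`simulate_block`, `s`, `r`, `r0`, `fail_prob`) are hypothetical, and the sketch rests on several assumptions: fragment failures are independent, repair is instantaneous (the paper, by contrast, accounts for repair time), and each repair transfers `s` fragments to rebuild the block plus one transfer per regenerated fragment.

```python
import random


def simulate_block(s, r, r0, fail_prob, steps, seed=0):
    """Toy discrete-time simulation of one block under lazy repair.

    s         -- fragments needed to reconstruct the block
    r         -- redundancy fragments added by the erasure code
    r0        -- repair threshold: repair once surviving redundancy <= r0
    fail_prob -- per-step, per-fragment failure probability (assumed independent)
    steps     -- number of time steps to simulate
    """
    rng = random.Random(seed)
    alive = s + r          # fragments currently stored
    repairs = 0            # number of repair operations triggered
    rebuilt = 0            # fragments regenerated over the whole run
    transferred = 0        # fragments moved over the network (assumed cost model)

    def result(lost):
        return {
            "lost": lost,
            "repairs": repairs,
            "fragments_rebuilt": rebuilt,
            "fragments_transferred": transferred,
        }

    for _ in range(steps):
        # Each surviving fragment fails independently during this step.
        failures = sum(1 for _ in range(alive) if rng.random() < fail_prob)
        alive -= failures

        # With fewer than s fragments the block cannot be reconstructed.
        if alive < s:
            return result(True)

        missing = (s + r) - alive
        # Lazy repair: wait until the surviving redundancy reaches the threshold.
        if missing > 0 and alive - s <= r0:
            repairs += 1
            rebuilt += missing
            # Assumed cost: fetch s fragments, then send out each rebuilt one.
            transferred += s + missing
            alive = s + r  # assumption: repair completes instantly

    return result(False)


if __name__ == "__main__":
    # Compare a nearly eager policy (r0 = r - 1) with a lazier one (r0 = 2).
    for threshold in (7, 2):
        print(threshold, simulate_block(s=8, r=8, r0=threshold,
                                        fail_prob=0.01, steps=10_000, seed=1))
```

Under this cost model, the fixed cost of `s` fragments per repair is shared among at least `r - r0` regenerated fragments, which is one way to see why a lower threshold can amortize bandwidth, at the price of running closer to data loss.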
