P2P storage systems: Study of different placement policies

In a P2P storage system using erasure codes, a data block is encoded into many redundant fragments, which are then sent to distinct peers in the network. In this work, we study how different placement policies for these fragments affect the performance of storage systems. Several practical factors (easier control, software reuse, latency) tend to favor data placement strategies that preserve some degree of locality. We compare three policies: two local policies, in which data are stored on logical neighbors, and one global policy, in which data are spread randomly across the whole system. We focus on the probability of losing a data block and on the bandwidth consumed to maintain the redundancy. We use simulations to show that, without resource constraints, the average values are the same regardless of the placement policy. However, bandwidth usage is much burstier under the local policies. When bandwidth is limited, these bursts lengthen maintenance time and hence increase the risk of data loss. We then show that a suitable degree of locality can be introduced to combine the efficiency of the global policy with the practical advantages of local placement. Additionally, we propose a new external reconstruction strategy that greatly improves the performance of local placement strategies. Finally, we give analytical methods to estimate the mean time to data loss for the three policies.
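
The contrast between global and local placement can be illustrated with a small sketch. It is not the simulator used in this work. It assumes a hypothetical local policy that stores the fragments of a block on consecutive peers of a logical ring, and all names and parameters (`global_placement`, `local_placement`, `repair_helpers`, `compare`, 200 peers, 10 fragments per block) are chosen only for illustration. The sketch counts how many distinct peers hold fragments of the blocks affected by a single peer failure, that is, how many peers can take part in rebuilding the lost fragments.

```python
import random


def global_placement(num_peers, n, rng):
    """Global policy: n distinct peers drawn uniformly from the whole system."""
    return rng.sample(range(num_peers), n)


def local_placement(num_peers, n, rng):
    """Assumed local policy: a random peer and its n - 1 successors on a ring."""
    start = rng.randrange(num_peers)
    return [(start + i) % num_peers for i in range(n)]


def repair_helpers(placements, failed_peer):
    """Peers that share at least one block with the failed peer."""
    helpers = set()
    for peers in placements:
        if failed_peer in peers:
            helpers.update(peers)
    helpers.discard(failed_peer)
    return helpers


def compare(num_peers=200, n=10, num_blocks=5000, seed=1):
    """Count the peers involved in repairs after peer 0 fails, per policy."""
    rng = random.Random(seed)
    result = {}
    for name, policy in (("global", global_placement), ("local", local_placement)):
        placements = [policy(num_peers, n, rng) for _ in range(num_blocks)]
        result[name] = len(repair_helpers(placements, failed_peer=0))
    return result


if __name__ == "__main__":
    print(compare())
```

Under this assumed ring policy, at most 2(n - 1) neighbors can contribute to the repairs triggered by one failure, whereas under global placement the same work is spread over many more peers. This is one way to see why repair traffic per peer is more concentrated with local placement.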
