A realistic simulation model for peer-to-peer storage systems

The peer-to-peer (P2P) paradigm has emerged as a cheap, scalable, self-repairing and fault-tolerant storage solution. Such systems rely on erasure codes to generate redundant fragments of each block of data, which increases reliability and availability and mitigates the effects of churn. When the number of unreachable fragments reaches a predefined threshold, because of permanent departures or long disconnections of peers, a recovery process is initiated to regenerate the missing fragments; this requires multiple fragments of the block to be downloaded in parallel for enhanced service. Recent modeling efforts addressing the availability and durability of data have assumed that the recovery time follows an exponential distribution, mainly because no study has characterized its "real" distribution. This work aims to fill this gap and to better understand the behavior of these systems through simulation, taking into account the heterogeneity of peers, the underlying network topology, the propagation delays and the transport protocol. To that end, the distributed storage protocol is implemented in the NS-2 network simulator. This paper describes a realistic simulation model that captures the behavior of P2P storage systems. We provide experimental results showing how models of availability and durability are affected by the distribution of recovery times, which in turn depends on the characteristics of the network and the context.
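As a rough illustration of why the exponential assumption deserves scrutiny, the following minimal sketch models a recovery as the slowest of several parallel fragment downloads. It is not the NS-2 model described here: the function names, the choice of 8 parallel downloads, the 10-second mean and the exponential per-fragment download times are all assumptions made for the example. An exponential distribution has a coefficient of variation of exactly 1, so a markedly different value suggests the recovery time is not exponential even under these simplified assumptions.

```python
import random
import statistics


def needs_recovery(unreachable: int, threshold: int) -> bool:
    """Recovery starts once the unreachable fragments reach the threshold."""
    return unreachable >= threshold


def recovery_time(rng: random.Random, downloads: int, mean_download: float) -> float:
    """Time to fetch `downloads` fragments in parallel.

    Assumption: each download time is exponential with the same mean.
    The recovery completes only when the slowest download finishes.
    """
    return max(rng.expovariate(1.0 / mean_download) for _ in range(downloads))


def coefficient_of_variation(samples: list[float]) -> float:
    """Standard deviation divided by mean (equals 1 for an exponential)."""
    return statistics.pstdev(samples) / statistics.fmean(samples)


rng = random.Random(42)  # fixed seed so the run is repeatable
# Hypothetical parameters: 8 parallel downloads, 10 s mean per fragment.
samples = [recovery_time(rng, downloads=8, mean_download=10.0) for _ in range(20000)]
cv = coefficient_of_variation(samples)
mean_recovery = statistics.fmean(samples)

print(f"mean recovery time: {mean_recovery:.2f} s, coefficient of variation: {cv:.2f}")
```

Taking the maximum over the parallel downloads is the key design choice: it reflects that the block cannot be rebuilt until every required fragment has arrived. A packet-level simulation would replace the assumed exponential download times with times produced by the network topology, propagation delays and transport protocol.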
