Maintaining replicas in unstructured P2P systems

Replication is widely used in unstructured peer-to-peer systems to improve search or achieve availability. We identify and solve a subclass of replication problems in which each object is associated with a maintainer node, and its replicas should be available only as long as its maintainer is part of the network. Such a requirement arises in various applications, e.g., when objects are directory lists, service lists, or subscriptions of a publish/subscribe system. We provide maintainers with proven guarantees on the number of replicas, in spite of network churn and crash failures. We also tackle the related problems of changing the number of replicas, updating replicas, balancing storage load in a heterogeneous network, and eliminating replicas left behind by crashed maintainers. Our algorithm is based on probabilistic methods and is simple to implement. We show by simulation and formal proof that our algorithm is correct.
