PlanetP: using gossiping and random replication to support reliable peer-to-peer content search and retrieval

We introduce the PlanetP system, which explores the construction of a reliable peer-to-peer (P2P) content search and retrieval service using global state that is randomly circulated among the peers of an unstructured community. Our work is an alternative to recent P2P systems that focus on enabling very large-scale name-based object location using sophisticated distributed data structures. We show that our simpler approach scales to several thousand peers (ultimately targeting the regime of about ten thousand) and converges in several minutes using only modest bandwidth, while still maintaining reliable search, ranking, and retrieval similar to an Internet search engine. Unlike current search engines or other P2P systems, however, PlanetP neither requires centralized directories or management nor builds a complex distributed data structure. PlanetP achieves its goals using three major components. First, peers collaborate to maintain local copies of the global membership directory, along with compact summaries of shared content, using a randomized gossiping algorithm. Second, peers implement a per-query, text-based ranking algorithm to help users ignore irrelevant documents. Finally, peers collaborate to replicate unpopular content (popular content is naturally highly replicated via hoarding) using Reed-Solomon erasure coding to increase the probability of suc-
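
To give a feel for why randomized gossiping can spread state through a community in few rounds, the following is a minimal sketch of generic push-style rumor spreading. It is not PlanetP's actual protocol, which the abstract does not specify. The function name `push_gossip_rounds`, its parameters, and the one-target-per-round model are all assumptions made for illustration.

```python
import random


def push_gossip_rounds(num_peers, seed=0, max_rounds=10_000):
    """Simulate generic push gossip (an illustrative model, not PlanetP's algorithm).

    Peer 0 starts with a new piece of state. In every round, each informed
    peer pushes it to one peer chosen uniformly at random (possibly one that
    already has it, or itself). Returns (rounds_used, informed_count).
    """
    rng = random.Random(seed)  # seeded so that a run is reproducible
    informed = {0}
    rounds = 0
    while len(informed) < num_peers and rounds < max_rounds:
        # One random target per currently informed peer.
        targets = {rng.randrange(num_peers) for _ in informed}
        informed |= targets
        rounds += 1
    return rounds, len(informed)


if __name__ == "__main__":
    rounds, informed = push_gossip_rounds(1000, seed=1)
    print(f"{informed} peers informed after {rounds} rounds")
```

Because the informed set can at most double per round, reaching 1000 peers takes at least 10 rounds in this model. The number of rounds grows roughly logarithmically with community size, which is the property that makes gossip attractive at the scale of thousands of peers.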
