PlanetP: using gossiping to build content addressable peer-to-peer information sharing communities

We introduce PlanetP, content addressable publish/subscribe service for unstructured peer-to-peer (P2P) communities. PlanetP supports content addressing by providing: (1) a gossiping layer used to globally replicate a membership directory and an extremely compact content index; and (2) a completely distributed content search and ranking algorithm that help users find the most relevant information. PlanetP is a simple, yet powerful system for sharing information. PlanetP is simple because each peer must only perform a periodic, randomized, point-to-point message exchange with other peers. PlanetP is powerful because it maintains a globally content-ranked view of the shared data. Using simulation and a prototype implementation, we show that PlanetP achieves ranking accuracy that is comparable to a centralized solution and scales easily to several thousand peers while remaining resilient to rapid membership changes.

[1]  Mor Harchol-Balter,et al.  Resource discovery in distributed networks , 1999, PODC '99.

[2]  Robert Tappan Morris,et al.  Ivy: a read/write peer-to-peer file system , 2002, OSDI '02.

[3]  Stefan Saroiu,et al.  A Measurement Study of Peer-to-Peer File Sharing Systems , 2001 .

[4]  H BloomBurton Space/time trade-offs in hash coding with allowable errors , 1970 .

[5]  Anja Feldmann,et al.  Rate of Change and other Metrics: a Live Study of the World Wide Web , 1997, USENIX Symposium on Internet Technologies and Systems.

[6]  Chris Buckley,et al.  Implementation of the SMART Information Retrieval System , 1985 .

[7]  Anne-Marie Kermarrec,et al.  The many faces of publish/subscribe , 2003, CSUR.

[8]  Indranil Gupta,et al.  Kelips: Building an Efficient and Stable P2P DHT through Increased Memory and Background Overhead , 2003, IPTPS.

[9]  W. Bruce Croft,et al.  Searching distributed collections with inference networks , 1995, SIGIR '95.

[10]  Ian T. Foster,et al.  On Death, Taxes, and the Convergence of Peer-to-Peer and Grid Computing , 2003, IPTPS.

[11]  S. Golomb Run-length encodings. , 1966 .

[12]  Luis Gravano,et al.  The effectiveness of GIOSS for the text database discovery problem , 1994, SIGMOD '94.

[13]  Ben Y. Zhao,et al.  An Infrastructure for Fault-tolerant Wide-area Location and Routing , 2001 .

[14]  Thu D. Nguyen,et al.  Text-Based Content Search and Retrieval in Ad-hoc P2P Communities , 2002, NETWORKING Workshops.

[15]  James C. French,et al.  Comparing the performance of database selection algorithms , 1999, SIGIR '99.

[16]  Amin Vahdat,et al.  Efficient Peer-to-Peer Keyword Searching , 2003, Middleware.

[17]  Pierre Jouvelot,et al.  Semantic file systems , 1991, SOSP '91.

[18]  Peter Druschel,et al.  Pastry: Scalable, distributed object location and routing for large-scale peer-to- , 2001 .

[19]  David R. Karger,et al.  Chord: A scalable peer-to-peer lookup service for internet applications , 2001, SIGCOMM '01.

[20]  Ben Y. Zhao,et al.  OceanStore: an architecture for global-scale persistent storage , 2000, SIGP.

[21]  Mark Handley,et al.  A scalable content-addressable network , 2001, SIGCOMM '01.

[22]  Michael B. Jones,et al.  Herald: achieving a global event notification service , 2001, Proceedings Eighth Workshop on Hot Topics in Operating Systems.

[23]  Ian H. Witten,et al.  Managing Gigabytes: Compressing and Indexing Documents and Images , 1999 .

[24]  David R. Karger,et al.  Analysis of the evolution of peer-to-peer systems , 2002, PODC '02.

[25]  Ugur Çetintemel,et al.  Consistency management in Deno , 2000, Mob. Networks Appl..

[26]  David R. Karger,et al.  Consistent hashing and random trees: distributed caching protocols for relieving hot spots on the World Wide Web , 1997, STOC '97.

[27]  Hugh E. Williams,et al.  Searchable words on the Web , 2005, International Journal on Digital Libraries.

[28]  Thomas E. Anderson,et al.  A Comparison of File System Workloads , 2000, USENIX Annual Technical Conference, General Track.

[29]  Robbert van Renesse,et al.  Astrolabe: A robust and scalable technology for distributed system monitoring, management, and data mining , 2003, TOCS.

[30]  Antony I. T. Rowstron,et al.  Pastry: Scalable, Decentralized Object Location, and Routing for Large-Scale Peer-to-Peer Systems , 2001, Middleware.

[31]  Burton H. Bloom,et al.  Space/time trade-offs in hash coding with allowable errors , 1970, CACM.

[32]  Richard P. Martin,et al.  PlanetP: Infrastructure Support for P2P Information Sharing , 2001 .

[33]  David Gelernter,et al.  Generative communication in Linda , 1985, TOPL.

[34]  Omprakash D. Gnawali A Keyword-Set Search System for Peer-to-Peer Networks , 2002 .

[35]  Donna K. Harman,et al.  Overview of the first TREC conference , 1993, SIGIR.

[36]  Scott Shenker,et al.  Epidemic algorithms for replicated database maintenance , 1988, OPSR.

[37]  Sergey Brin,et al.  The Anatomy of a Large-Scale Hypertextual Web Search Engine , 1998, Comput. Networks.

[38]  Marvin Theimer,et al.  The Bayou Architecture: Support for Data Sharing Among Mobile Users , 1994, 1994 First Workshop on Mobile Computing Systems and Applications.