pNear: Combining Content Clustering and Distributed Hash Tables

Full-text search is a challenging problem in peer-to-peer (P2P) systems. Two promising directions for solving it are currently (1) distributed indexes such as distributed hash tables (DHTs) and (2) semantic overlay networks (SONs). SONs can in turn be divided into systems that cluster peers with similar content based on term overlap and systems that map both content and queries onto a shared semantic data structure. In this paper we present the pNear system, which combines DHTs with clustering via term overlap, and show that this combination tackles some important disadvantages of the individual approaches. We evaluate our approach via simulations based on a large, realistic data set that we constructed for this purpose and that will be useful for similar experiments by others.
