Searching Dynamic Communities with Personal Indexes

Often the challenge of finding relevant information reduces to finding the 'right' people who can answer our question. In this paper we present INGA (Interest-based Node Grouping Algorithms), a set of algorithms that integrate personal routing indices into semantic query processing to improve performance. As in social networks, peers in INGA cooperate to route queries for documents efficiently along adaptive shortcut-based overlays, using only local but semantically well-chosen information. We propose active and passive shortcut creation strategies for index building, and a novel algorithm that selects the most promising content providers for an individual query based on each peer's index. We quantify the benefit of our indexing strategy through extensive performance experiments in the SWAP simulation infrastructure. We show that, compared with other state-of-the-art algorithms, INGA improves recall and significantly reduces the number of messages.
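The abstract does not spell out the index layout or the ranking function, so the following is only a minimal sketch of the general idea: a peer keeps a small local index of shortcuts and uses it to pick promising peers for a query. The names `Shortcut`, `PersonalIndex`, `remember`, and `select`, the fixed capacity with least-hit eviction, and the Jaccard topic overlap used for ranking are all assumptions made for illustration, not INGA's actual definitions.

```python
from dataclasses import dataclass, field


@dataclass
class Shortcut:
    """A locally remembered pointer to a peer and the topics it answered (hypothetical)."""
    peer: str
    topics: frozenset
    hits: int = 1


@dataclass
class PersonalIndex:
    """Bounded local index of shortcuts; capacity and eviction rule are assumptions."""
    capacity: int = 4
    shortcuts: dict = field(default_factory=dict)

    def remember(self, peer, topics):
        """Record a shortcut, e.g. after a peer answered a query on these topics."""
        key = (peer, frozenset(topics))
        if key in self.shortcuts:
            self.shortcuts[key].hits += 1
            return
        if len(self.shortcuts) >= self.capacity:
            # Evict the least-used shortcut to keep the index small.
            weakest = min(self.shortcuts, key=lambda k: self.shortcuts[k].hits)
            del self.shortcuts[weakest]
        self.shortcuts[key] = Shortcut(peer, frozenset(topics))

    def select(self, query_topics, k=2):
        """Return up to k peers ranked by Jaccard overlap with the query topics."""
        query = frozenset(query_topics)
        scores = {}
        for s in self.shortcuts.values():
            union = query | s.topics
            overlap = len(query & s.topics) / len(union) if union else 0.0
            if overlap > 0:
                scores[s.peer] = max(scores.get(s.peer, 0.0), overlap)
        # Highest score first; ties broken by peer name for determinism.
        return sorted(scores, key=lambda p: (-scores[p], p))[:k]


index = PersonalIndex()
index.remember("alice", {"p2p", "routing"})
index.remember("bob", {"semantic web"})
index.remember("carol", {"p2p"})
print(index.select({"p2p", "routing"}))  # ['alice', 'carol']
```

A query whose topics match no shortcut yields an empty list; in a full system the peer would then presumably fall back to some default forwarding strategy, which this sketch leaves out.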
