PeerSearch: Efficient Information Retrieval in Peer-to-Peer Networks

In this paper, we propose PeerSearch, an efficient peer-to-peer information retrieval system that supports state-of-the-art content and semantic searches. PeerSearch avoids the scalability problems of existing systems that employ centralized indexing, index flooding, or query flooding. It also avoids the non-determinism exhibited by heuristic-based approaches. PeerSearch achieves both efficiency and determinism through an elegant combination of index placement and query routing. Given a query, PeerSearch needs to search only a small number of nodes to identify matching documents.
