rSearch: Ring-Based Semantic Overlay for Efficient Recall-Guaranteed Search in P2P Networks

Recall-guaranteed search is critical for P2P networks. Although building a semantic overlay improves search performance, existing designs face a tradeoff between search time and search quality (i.e., recall). They also incur high control overhead for overlay maintenance. In this paper, we present rSearch, which achieves fast search with guaranteed high recall. The rSearch overlay topology is a ring augmented with semantic chord links. Given a query, rSearch dispatches multiple query walkers that traverse the ring independently to find semantically relevant nodes that can answer it. The ring structure enables fast, low-redundancy query forwarding, while the abundant semantic chord links enable large semantic clusters. Bloom filters encode and compress node semantic summaries, greatly reducing control overhead. rSearch further incorporates churn resilience and network awareness to enhance system performance. Extensive simulations with a real-life file-sharing trace and a network latency trace show that rSearch greatly outperforms GES.
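
To illustrate how a Bloom filter can compress a node's semantic summary, the following is a minimal Python sketch. It is not the rSearch implementation: the abstract does not specify the filter size, the number of hash functions, or the hashing scheme, so the class BloomFilter, the helpers summarize and may_match, the 1024-bit size, and the SHA-256 double-hashing scheme are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter over string terms (illustrative only)."""

    def __init__(self, num_bits=1024, num_hashes=4):
        # Sizes are assumed values, not taken from the paper.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int serves as the bit array

    def _positions(self, term):
        # Derive num_hashes bit positions from one SHA-256 digest
        # using double hashing: h1 + i * h2 (mod num_bits).
        digest = hashlib.sha256(term.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force h2 to be odd
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, term):
        for pos in self._positions(term):
            self.bits |= 1 << pos

    def __contains__(self, term):
        # True means "possibly present"; False means "definitely absent".
        return all((self.bits >> pos) & 1 for pos in self._positions(term))


def summarize(terms, num_bits=1024, num_hashes=4):
    """Build a compact summary of a node's terms (hypothetical helper)."""
    summary = BloomFilter(num_bits, num_hashes)
    for term in terms:
        summary.add(term)
    return summary


def may_match(summary, query_terms):
    """Return True if the node may hold every query term (hypothetical helper)."""
    return all(term in summary for term in query_terms)


node_summary = summarize(["jazz", "piano", "live"])
print(may_match(node_summary, ["jazz", "piano"]))
```

A peer would exchange only the fixed-size bit array rather than its full term list, which is where the saving in control traffic comes from. Because a Bloom filter can return false positives but never false negatives, a positive match only means a node is worth querying, while a negative match safely rules it out.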
