Efficient and scalable query routing for unstructured peer-to-peer networks

Searching for content in peer-to-peer networks is an interesting and challenging problem. Queries in Gnutella-like unstructured systems that search by flooding or random walks must visit O(n) nodes in a network of size n, and thus consume significant amounts of bandwidth. In this paper, we propose a query routing protocol that keeps bandwidth consumption low during query forwarding by using a low-cost mechanism to create and maintain information about nearby objects. To achieve this, our protocol maintains a lightweight probabilistic routing table at each node that suggests the location of each object in the network. By following the corresponding routing table entries, a query can reach its destination in a small number of hops with high probability. However, maintaining routing tables in a large and highly dynamic network requires non-traditional mechanisms. We design a novel data structure called an exponentially decaying Bloom filter (EDBF) that encodes such probabilistic routing tables in a highly compressed manner and allows for efficient aggregation and propagation. The search primitives provided by our system can be used to search for single keys or multiple keywords with equal ease. Analytical modeling of our design predicts significant improvements in search efficiency, which we verified through extensive simulations; in these we observed an order of magnitude reduction in query path length over previous proposals.
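
To make the idea of a decaying Bloom filter concrete, the following is a minimal sketch and not the paper's implementation. The abstract does not specify how decay, aggregation, or next-hop selection work, so everything here is an assumption made for illustration: each set bit survives one hop of propagation with a fixed probability `keep_prob`, filters from different neighbors are aggregated with a bitwise OR, and a query is forwarded to the neighbor whose filter has the most bits set for the key. The names `DecayingBloomFilter`, `decayed`, `merge`, `score`, and `best_neighbor` are hypothetical.

```python
import hashlib
import random


class DecayingBloomFilter:
    """Illustrative Bloom filter whose bits are thinned out as it propagates."""

    def __init__(self, num_bits=1024, num_hashes=8):
        self.m = num_bits
        self.k = num_hashes
        self.bits = [False] * num_bits

    def _positions(self, key):
        # Derive k bit positions by salting SHA-256 with the hash index.
        return [
            int.from_bytes(hashlib.sha256(f"{i}:{key}".encode()).digest()[:8], "big")
            % self.m
            for i in range(self.k)
        ]

    def add(self, key):
        # A node that stores the object sets all k bits for its key.
        for p in self._positions(key):
            self.bits[p] = True

    def score(self, key):
        # Number of the key's k positions that are set; under the assumed
        # decay rule, a higher score suggests the object is fewer hops away.
        return sum(self.bits[p] for p in self._positions(key))

    def decayed(self, keep_prob, rng):
        # Assumed decay rule: each set bit survives one hop with keep_prob.
        out = DecayingBloomFilter(self.m, self.k)
        out.bits = [b and rng.random() < keep_prob for b in self.bits]
        return out

    def merge(self, other):
        # Assumed aggregation rule: bitwise OR of two equally sized filters.
        self.bits = [a or b for a, b in zip(self.bits, other.bits)]


def best_neighbor(neighbor_filters, key):
    # Forward the query to the neighbor whose filter scores highest for key.
    return max(neighbor_filters, key=lambda n: neighbor_filters[n].score(key))
```

Under these assumptions, the expected score of a key falls geometrically with hop distance from the node holding the object, which is what would let a query climb toward the object by repeatedly choosing the highest-scoring neighbor.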
