Data indexing in peer-to-peer DHT networks

Peer-to-peer distributed hash table (DHT) systems make it simple to discover specific data when the complete identifier (or key) of that data is known in advance. In practice, however, users looking up resources stored in peer-to-peer systems often have only partial information for identifying these resources. We describe techniques for indexing data stored in peer-to-peer DHT networks and for discovering the resources that match a given user query. Our system creates multiple indexes, organized hierarchically, which permit users to locate data even with scarce information, although at the price of a higher lookup cost. The data itself is stored on only one (or a few) of the nodes. Experimental evaluation demonstrates the effectiveness of our indexing techniques on a distributed peer-to-peer bibliographic database with realistic user query workloads.
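
The idea of hierarchical indexes can be illustrated with a minimal sketch. It is not the system evaluated here: the in-memory `ToyDHT` class, the `publish` and `search` helpers, and the `author=.../title=...` key format are all hypothetical names chosen for illustration, and the hierarchy has only two levels. A partial query (author only, or title only) maps to index entries that point at the complete key, and only the complete key maps to the data, so a less specific query needs more lookups.

```python
import hashlib
from collections import defaultdict


class ToyDHT:
    """In-memory stand-in for a DHT: hashed keys map to sets of values."""

    def __init__(self):
        self.store = defaultdict(set)
        self.lookups = 0  # number of get() calls, a rough proxy for lookup cost

    @staticmethod
    def _h(key):
        # A real DHT would route this hash to the responsible node.
        return hashlib.sha1(key.encode("utf-8")).hexdigest()

    def put(self, key, value):
        self.store[self._h(key)].add(value)

    def get(self, key):
        self.lookups += 1
        return set(self.store.get(self._h(key), ()))


def publish(dht, author, title, data):
    """Store data under its complete key and index it under partial keys."""
    full = f"author={author}/title={title}"  # assumed key format
    dht.put(full, ("data", data))            # the data lives under one key only
    dht.put(f"author={author}", ("query", full))  # index entry: author -> full key
    dht.put(f"title={title}", ("query", full))    # index entry: title -> full key


def search(dht, query):
    """Follow index entries from a (possibly partial) query down to the data."""
    results, pending, seen = set(), [query], set()
    while pending:
        q = pending.pop()
        if q in seen:
            continue
        seen.add(q)
        for kind, payload in dht.get(q):
            if kind == "data":
                results.add(payload)
            else:  # "query": a more specific key to look up next
                pending.append(payload)
    return results


dht = ToyDHT()
publish(dht, "Smith", "TCP", "smith-tcp.pdf")
publish(dht, "Smith", "IPv6", "smith-ipv6.pdf")
print(search(dht, "author=Smith"), dht.lookups)
```

In this sketch a search on the complete key costs one lookup, while a search on the author alone costs one lookup for the index entry plus one for each matching complete key, which mirrors the trade-off between scarce information and higher lookup cost.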
