Efficient Peer-to-Peer Keyword Searching

The recent file storage applications built on top of peer-to-peer distributed hash tables lack search capabilities. We believe that search is an important part of any document publication system. To that end, we have designed and analyzed a distributed search engine based on a distributed hash table. Our simulation results predict that our search engine can answer an average query in under one second, using under one kilobyte of bandwidth.

[1]  Li Fan,et al.  Summary cache: a scalable wide-area web cache sharing protocol , 2000, TNET.

[2]  Rajeev Motwani,et al.  The PageRank Citation Ranking : Bringing Order to the Web , 1999, WWW 1999.

[3]  Antony I. T. Rowstron,et al.  Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility , 2001, SOSP.

[4]  Guy M. Lohman,et al.  R* optimizer validation and performance evaluation for local queries , 1986, SIGMOD '86.

[5]  Hector Garcia-Molina,et al.  Efficient search in peer to peer networks , 2004 .

[6]  David R. Karger,et al.  Consistent hashing and random trees: distributed caching protocols for relieving hot spots on the World Wide Web , 1997, STOC '97.

[7]  Mark Handley,et al.  A scalable content-addressable network , 2001, SIGCOMM '01.

[8]  Ian Clarke,et al.  Freenet: A Distributed Anonymous Information Storage and Retrieval System , 2000, Workshop on Design Issues in Anonymity and Unobservability.

[9]  James K. Mullin,et al.  Optimal Semijoins for Distributed Database Systems , 1990, IEEE Trans. Software Eng..

[10]  David R. Karger,et al.  Chord: A scalable peer-to-peer lookup service for internet applications , 2001, SIGCOMM '01.

[11]  H BloomBurton Space/time trade-offs in hash coding with allowable errors , 1970 .

[12]  Philip A. Bernstein,et al.  Using Semi-Joins to Solve Relational Queries , 1981, JACM.

[13]  DruschelPeter,et al.  Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility , 2001 .

[14]  Anna R. Karlin,et al.  Practical network support for IP traceback , 2000, SIGCOMM.

[15]  Burton H. Bloom,et al.  Space/time trade-offs in hash coding with allowable errors , 1970, CACM.

[16]  Sergey Brin,et al.  The Anatomy of a Large-Scale Hypertextual Web Search Engine , 1998, Comput. Networks.

[17]  Hector Garcia-Molina,et al.  The Evolution of the Web and Implications for an Incremental Crawler , 2000, VLDB.

[18]  Ian Clarke,et al.  A Distributed Decentralised Information Storage and Retrieval System , 1999 .

[19]  David R. Karger,et al.  Analysis of the evolution of peer-to-peer systems , 2002, PODC '02.

[20]  Stefan Saroiu,et al.  A Measurement Study of Peer-to-Peer File Sharing Systems , 2001 .

[21]  David R. Karger,et al.  Wide-area cooperative storage with CFS , 2001, SOSP.

[22]  Ben Y. Zhao,et al.  An Infrastructure for Fault-tolerant Wide-area Location and Routing , 2001 .