Make it fresh, make it quick: searching a network of personal webservers

Personal webservers have proven to be a popular means of sharing files and peer collaboration. Unfortunately, the transient availability and rapidly evolving content on such hosts render centralized, crawl-based search indices stale and incomplete. To address this problem, we propose YouSearch, a distributed search application for personal webservers operating within a shared context (e.g., a corporate intranet). With YouSearch, search results are always fast, fresh and complete -- properties we show arise from an architecture that exploits both the extensive distributed resources available at the peer webservers and a centralized repository of summarized network state. YouSearch extends the concept of a shared context within web communities by enabling peers to aggregate into groups and users to search over specific groups. In this paper, we describe the challenges, design, and implementation of YouSearch, along with our experiences from a successful intranet deployment.
