Towards a fully distributed P2P Web search engine

Most centralized Web search engines are finding it increasingly hard to keep pace with growing information needs. We present Coopeer, a fully distributed, collaborative peer-to-peer Web search engine. Its goal is to complement centralized search engines by drawing on collaboration among users to provide more humanized and personalized results. Towards this goal, we introduce three main ideas: (a) PeerRank, which uses cooperation among users for evaluation; (b) a query-based representation, which yields a more humanized description of documents; and (c) a semantic routing algorithm, which returns user-customized results.
