Paper information - A peer-selection algorithm for information retrieval

A peer-selection algorithm for information retrieval

A novel method for creating collection summaries is developed, and a fully decentralized peer-selection algorithm is described. The algorithm finds the most promising peers for answering a given query. Specifically, peers publish per-term synopses of their documents. A peer's synopses for a given term are divided into score intervals, and for each interval a KMV (K Minimal Values) synopsis of the peer's documents is created. The synopses are used to effectively rank peers by their relevance to a multi-term query. The proposed approach is verified by experiments on a large real-world dataset. In particular, two collections were created from this dataset, each with a different number of peers. Compared to state-of-the-art approaches, the proposed method is effective and efficient even when documents are randomly distributed among peers.
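The per-interval KMV synopses mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. The function names, the SHA-1 based hash, the assumption that scores lie in [0, 1], and the equal-width score intervals are all hypothetical choices made for illustration. The distinct-value and intersection estimators are the standard KMV ones; the abstract does not say how the paper combines them into a peer ranking, so no ranking function is shown.

```python
import hashlib


def unit_hash(item: str) -> float:
    """Map an item to a pseudo-random value in [0, 1) (assumed hash choice)."""
    digest = hashlib.sha1(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def kmv(items, k):
    """KMV synopsis: the k smallest distinct hash values, in ascending order."""
    return sorted({unit_hash(x) for x in items})[:k]


def estimate_distinct(synopsis, k):
    """Estimate the number of distinct items summarized by a KMV synopsis."""
    if len(synopsis) < k:
        # Fewer than k hashes means the synopsis holds every item: exact count.
        return float(len(synopsis))
    return (k - 1) / synopsis[k - 1]


def union(a, b, k):
    """KMV synopsis of the union of the two underlying sets."""
    return sorted(set(a) | set(b))[:k]


def estimate_intersection(a, b, k):
    """Estimate how many items the two underlying sets share."""
    merged = union(a, b, k)
    common = set(a) & set(b)
    if len(merged) < k:
        # Both synopses are complete, so the overlap is exact.
        return float(len(common))
    shared = sum(1 for h in merged if h in common)
    return (shared / k) * (k - 1) / merged[k - 1]


def build_term_synopses(postings, n_intervals, k):
    """Group (doc_id, score) postings of one term into equal-width score
    intervals (scores assumed to lie in [0, 1]) and build one KMV synopsis
    per non-empty interval. Returns {interval_index: synopsis}."""
    buckets = {}
    for doc_id, score in postings:
        idx = min(int(score * n_intervals), n_intervals - 1)
        buckets.setdefault(idx, []).append(doc_id)
    return {idx: kmv(docs, k) for idx, docs in buckets.items()}
```

Under these assumptions, a peer would publish the result of `build_term_synopses` for each of its terms. A querying peer could then intersect the synopses of different query terms, interval by interval, to estimate how many documents match several terms in the high-score intervals.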

Yehoshua Sagiv | Yosi Mass | Michal Shmueli-Scheuer
