Paper information - What Snippets Say about Pages in Federated Web Search


What is the likelihood that a Web page is considered relevant to a query, given the relevance assessment of the corresponding snippet? Using a new federated IR test collection that contains search results from over a hundred search engines on the internet, we are able to investigate such research questions from a global perspective. Our test collection covers the main Web search engines like Google, Yahoo!, and Bing, as well as a number of smaller search engines dedicated to multimedia, shopping, etc., and as such reflects a realistic Web environment. Using a large set of relevance assessments, we are able to investigate the connection between snippet quality and page relevance. The dataset is strongly inhomogeneous, and although the assessors’ consistency is shown to be satisfying, care is required when comparing resources. To this end, a number of probabilistic quantities, based on snippet and page relevance, are introduced and evaluated.
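The opening question of the abstract is a conditional probability: how likely a page is to be judged relevant, given the judgment of its snippet. The abstract does not define the paper's probabilistic quantities, so the following is only a minimal sketch of one plausible estimate, assuming binary relevance labels and a simple ratio of counts. The function name `page_given_snippet` and the toy judgments are hypothetical and are not taken from the paper or its test collection.

```python
from collections import Counter


def page_given_snippet(judgments):
    """Estimate P(page relevant | snippet label) from paired judgments.

    `judgments` is an iterable of (snippet_relevant, page_relevant)
    boolean pairs, one pair per assessed search result.
    Binary labels are an assumption of this sketch.
    """
    counts = Counter(judgments)
    estimates = {}
    for snippet_label in (True, False):
        relevant_pages = counts[(snippet_label, True)]
        total = relevant_pages + counts[(snippet_label, False)]
        if total:  # skip labels that never occur to avoid dividing by zero
            estimates[snippet_label] = relevant_pages / total
    return estimates


# Hypothetical toy judgments, invented for illustration only.
toy = [
    (True, True), (True, True), (True, False),
    (False, False), (False, True), (False, False), (False, False),
]
estimates = page_given_snippet(toy)
print(estimates)
```

Because the abstract warns that the collection is strongly inhomogeneous, such an estimate would presumably be computed per resource before any comparison across search engines, rather than pooled over all results.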

Thomas Demeester | Dong Nguyen | Dolf Trieschnigg | Chris Develder
