Paper information - The University of Amsterdam at WebCLEF 2007: Using Centrality to Rank Web Snippets

We describe our participation in the WebCLEF 2007 task, which targeted snippet retrieval from web data. Our system ranks snippets using a simple similarity-based centrality measure inspired by web page ranking algorithms. We experimented with the retrieval unit (sentences or paragraphs) and with the similarity function used to compute centrality (word overlap or cosine similarity). Paragraphs combined with cosine similarity performed best, with precision around 20% and recall around 25% according to human assessments of the first 7,000 bytes of the responses for individual topics.
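The centrality idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so everything specific here is an assumption made for illustration: the degree-style score (the sum of a snippet's similarities to all other snippets), the whitespace tokenizer, the Jaccard normalization of word overlap, and the function names `word_overlap`, `cosine`, and `rank_by_centrality`. It is not the authors' implementation.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, no stemming or stopword removal.
    return text.lower().split()


def word_overlap(a, b):
    # Assumption: overlap normalized Jaccard-style (shared types / all types).
    sa, sb = set(a), set(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def cosine(a, b):
    # Cosine similarity over raw term-frequency vectors.
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[w] * cb[w] for w in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_centrality(snippets, similarity=cosine):
    # Score each snippet by its total similarity to every other snippet,
    # then return (snippet, score) pairs sorted from most to least central.
    tokens = [tokenize(s) for s in snippets]
    scores = []
    for i, ti in enumerate(tokens):
        score = sum(similarity(ti, tj) for j, tj in enumerate(tokens) if j != i)
        scores.append(score)
    order = sorted(range(len(snippets)), key=lambda i: scores[i], reverse=True)
    return [(snippets[i], scores[i]) for i in order]
```

Calling `rank_by_centrality(paragraphs)` on a list of paragraph strings uses cosine similarity, and passing `similarity=word_overlap` switches the similarity function. Whether the units are sentences or paragraphs depends only on how the input text is split before the call.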

Valentin Jijkoun | Maarten de Rijke
