Paper Information - Using Centrality to Rank Web Snippets

Using Centrality to Rank Web Snippets

We describe our participation in the WebCLEF 2007 task, which targets snippet retrieval from web data. Our system ranks snippets using a simple similarity-based centrality measure inspired by web page ranking algorithms. We experimented with two retrieval units (sentences and paragraphs) and two similarity functions for the centrality computation (word overlap and cosine similarity). Paragraphs combined with cosine similarity performed best, with precision around 20% and recall around 25% according to human assessments of the first 7,000 bytes of the response for each topic.
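As a rough illustration of similarity-based centrality ranking, the sketch below scores each snippet by the sum of its similarities to all other snippets and sorts by that score. This is only one plausible reading of the abstract, not the authors' implementation: the function names, the tokenizer, the Jaccard-style definition of word overlap, the use of raw term frequencies for cosine similarity, and the summed-similarity scoring are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def word_overlap(a, b):
    """Jaccard-style overlap between the word sets of two snippets (assumed definition)."""
    sa, sb = set(tokenize(a)), set(tokenize(b))
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def cosine_similarity(a, b):
    """Cosine similarity between raw term-frequency vectors of two snippets."""
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def rank_by_centrality(snippets, similarity=cosine_similarity):
    """Return snippets sorted by summed similarity to all other snippets, highest first."""
    scored = []
    for i, s in enumerate(snippets):
        # Centrality here is the total similarity to every other snippet.
        score = sum(similarity(s, t) for j, t in enumerate(snippets) if j != i)
        scored.append((score, s))
    # Sort on the score only; ties keep their original relative order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in scored]


if __name__ == "__main__":
    candidates = [
        "the cat sat on the mat",
        "a cat sat on a mat",
        "stock prices fell sharply",
    ]
    for snippet in rank_by_centrality(candidates):
        print(snippet)
```

Passing `similarity=word_overlap` switches the similarity function, and the retrieval unit (sentence or paragraph) is determined simply by how the input text is split into `snippets` before ranking.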

Valentin Jijkoun | Maarten de Rijke
