Using Centrality to Rank Web Snippets

We describe our participation in the WebCLEF 2007 task, targeted at snippet retrieval from web data. Our system ranks snippets based on a simple similarity-based centrality, inspired by the web page ranking algorithms. We experimented with retrieval units (sentences and paragraphs) and with the similarity functions used for centrality computations (word overlap and cosine similarity). We found that using paragraphs with the cosine similarity function shows the best performance with precision around 20% and recall around 25% according to human assessments of the first 7,000 bytes of responses for individual topics.