University of Delaware at Diversity Task of Web Track 2010

We report our systems and experiments in the diversity task of the TREC 2010 Web track. Our goal is to evaluate the effectiveness of the proposed methods for search result diversification on a large data collection. Our diversification systems use a greedy algorithm that selects, at each rank position, the document with the highest diversity score, and they return a re-ranked list of documents diversified with respect to the query subtopics. The system extracts different groups of semantically related terms from the originally retrieved documents and treats them as the subtopics of the query. It then uses the proposed diversity retrieval functions to compute the diversity score of each document at a particular position from three factors: the similarity between the document and each subtopic, the relevance score of the subtopic given the query, and the novelty of the subtopic given the previously selected documents.
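The greedy re-ranking described above can be illustrated with a minimal Python sketch. It is not the exact retrieval function used in our runs: the abstract does not give the formula, so the product form below (subtopic relevance times novelty times document-subtopic similarity) and the coverage update are assumptions, and the names `greedy_diversify`, `doc_subtopic_sim`, and `subtopic_weight` are hypothetical. Similarities are assumed to lie in [0, 1].

```python
from typing import Dict, List


def greedy_diversify(
    doc_subtopic_sim: Dict[str, Dict[str, float]],
    subtopic_weight: Dict[str, float],
    k: int,
) -> List[str]:
    """Greedily pick up to k documents, one rank position at a time.

    doc_subtopic_sim[d][s]: similarity between document d and subtopic s.
    subtopic_weight[s]: relevance of subtopic s given the query.
    Both inputs and the scoring form are illustrative assumptions.
    """
    selected: List[str] = []
    # coverage[s] tracks how much of subtopic s the selected documents
    # already cover; novelty of s is taken to be 1 - coverage[s].
    coverage = {s: 0.0 for s in subtopic_weight}
    remaining = set(doc_subtopic_sim)

    def score(doc: str) -> float:
        # Assumed diversity score: sum over subtopics of
        # relevance * novelty * similarity.
        return sum(
            subtopic_weight[s]
            * (1.0 - coverage[s])
            * doc_subtopic_sim[doc].get(s, 0.0)
            for s in subtopic_weight
        )

    while remaining and len(selected) < k:
        # Sorting makes tie-breaking deterministic (lowest document id wins).
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
        # Subtopics covered by the chosen document become less novel.
        for s in subtopic_weight:
            sim = doc_subtopic_sim[best].get(s, 0.0)
            coverage[s] = 1.0 - (1.0 - coverage[s]) * (1.0 - sim)
    return selected


if __name__ == "__main__":
    sims = {
        "d1": {"a": 0.9},
        "d2": {"a": 0.8},
        "d3": {"b": 0.7},
    }
    weights = {"a": 0.6, "b": 0.4}
    print(greedy_diversify(sims, weights, k=3))
```

In this toy input, `d2` is the second most similar document to subtopic `a`, but once `d1` has been selected that subtopic has little novelty left, so `d3`, which covers subtopic `b`, is ranked ahead of `d2`.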
