Selecting effective expansion terms for diversity

Query expansion has been successfully applied in Information Retrieval, mostly for ad hoc search tasks. However, query expansion can also fail, particularly when the query is ambiguous. For an ambiguous query, an effective strategy is to diversify the search results, in the hope of retrieving at least one relevant result for each of the possible information needs underlying the query. In this paper, we propose to tailor query expansion to diversify the search results in order to tackle query ambiguity. In particular, we introduce a novel approach that selects diverse expansion terms given a suitable partition of the feedback provided by search users. Thorough experiments in the context of the TREC 2009, 2010 and 2011 Web tracks examine the effectiveness of our approach at improving the diversification performance of state-of-the-art query expansion techniques.
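
To make the idea of selecting expansion terms from a partitioned feedback set more concrete, the following is a minimal sketch and not the method evaluated in the paper. It assumes that the feedback documents have already been grouped into partitions (one per presumed information need), uses plain within-partition document frequency as a stand-in term weight, and takes turns across partitions so that each one contributes terms. The names `score_terms` and `select_diverse_terms` are hypothetical.

```python
from collections import Counter


def score_terms(docs):
    """Score terms in one partition by document frequency (an assumed, simplistic weight)."""
    counts = Counter()
    for doc in docs:
        # Count each term at most once per document.
        counts.update(set(doc.lower().split()))
    return counts


def select_diverse_terms(partitions, query_terms, k):
    """Pick up to k expansion terms, taking turns across feedback partitions."""
    query = {t.lower() for t in query_terms}
    ranked = []
    for docs in partitions:
        scores = score_terms(docs)
        # Rank candidate terms by descending score, then alphabetically for determinism.
        candidates = [t for t in scores if t not in query]
        ranked.append(sorted(candidates, key=lambda t: (-scores[t], t)))

    selected = []
    depth = 0
    # Round-robin: take the best remaining term from each partition in turn.
    while len(selected) < k and any(depth < len(r) for r in ranked):
        for r in ranked:
            if depth < len(r) and r[depth] not in selected:
                selected.append(r[depth])
                if len(selected) == k:
                    break
        depth += 1
    return selected


# Toy feedback for the ambiguous query "jaguar", split into two assumed partitions.
partitions = [
    ["jaguar car speed", "jaguar car engine"],
    ["jaguar cat jungle", "jaguar cat prey"],
]
print(select_diverse_terms(partitions, ["jaguar"], k=4))
# prints ['car', 'cat', 'engine', 'jungle']
```

In this toy run, both senses of the query contribute terms to the expanded query, whereas ranking terms over the pooled feedback could let a single dominant sense supply all of them.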
