Selectively diversifying web search results

Search result diversification is a natural approach for tackling ambiguous queries. Nevertheless, not all queries are equally ambiguous, and hence different queries could benefit from different diversification strategies. Existing approaches typically encode a more lenient or more aggressive diversification strategy as a trade-off between promoting relevance and promoting diversity in the search results. In this paper, we propose to learn such a trade-off on a per-query basis. In particular, we examine how the need for diversification can be learnt for each query: given a diversification approach and an unseen query, we predict an effective trade-off between relevance and diversity based on similar previously seen queries. Thorough experiments using the TREC ClueWeb09 collection show that our selective approach can significantly outperform uniform diversification for both classical and state-of-the-art diversification approaches.
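The idea of predicting a per-query trade-off from similar previously seen queries can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the query features, the learning method, or the diversification approach, so the nearest-neighbour averaging, the MMR-style scoring, and every name below (`predict_tradeoff`, `diversify`, the feature vectors) are assumptions made for illustration only.

```python
import math


def predict_tradeoff(query_features, training_queries, k=3):
    """Average the best-known trade-off of the k nearest training queries.

    training_queries is a list of (feature_vector, best_tradeoff) pairs.
    Both the features and the averaging rule are hypothetical.
    """
    by_distance = sorted(
        training_queries,
        key=lambda item: math.dist(query_features, item[0]),
    )
    nearest = by_distance[:k]
    return sum(tradeoff for _, tradeoff in nearest) / len(nearest)


def diversify(candidates, similarity, tradeoff, depth=10):
    """Greedy MMR-style re-ranking (assumed stand-in for a diversifier).

    candidates maps doc_id -> relevance score. A tradeoff of 0 ranks
    purely by relevance; higher values penalise redundancy with the
    documents already selected.
    """
    remaining = dict(candidates)
    ranking = []
    while remaining and len(ranking) < depth:
        def score(doc):
            redundancy = max(
                (similarity(doc, chosen) for chosen in ranking),
                default=0.0,
            )
            return (1 - tradeoff) * remaining[doc] - tradeoff * redundancy

        best = max(remaining, key=score)
        ranking.append(best)
        del remaining[best]
    return ranking
```

In this sketch, a query whose nearest neighbours benefited from aggressive diversification receives a high trade-off value, while a query resembling unambiguous ones is left close to a pure relevance ranking.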

Craig Macdonald | Rodrygo L. T. Santos | Iadh Ounis
