User effort minimization through adaptive diversification

Ambiguous queries, which are typical on search engines and recommendation systems, often return a large number of results from multiple interpretations. Given that many users perform their searches on limited-size screens (e.g. mobile phones), an important problem is which results to display first. Recent work has suggested displaying a set of results (top-k) based on their relevance score with respect to the query and their diversity with respect to each other. However, previous works mostly balance relevance and diversity in a predefined, fixed way. In this paper, we show that different search tasks have different ideal balances of relevance and diversity. We propose a principled method for adaptive diversification of query results that minimizes the user effort to find the desired results by dynamically balancing relevance and diversity at each query step (e.g. when refining the query or viewing the next page of results). We introduce a navigation cost model as a means to estimate the effort required to navigate the query results, and show that the problem of estimating the ideal amount of diversification at each step is NP-hard. We propose an efficient approximation algorithm to select a near-optimal subset of the query results that minimizes the expected user effort. Finally, we demonstrate the efficacy and efficiency of our solution in minimizing user effort, compared to state-of-the-art ranking methods, by means of an extensive experimental evaluation and a comprehensive user study on Amazon Mechanical Turk.
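To make the relevance/diversity trade-off concrete, the following is a minimal sketch of a greedy top-k selection with a fixed trade-off weight, in the spirit of maximal marginal relevance. It is not the adaptive algorithm or the navigation cost model proposed in the paper; it only illustrates the kind of fixed balance that the abstract argues should instead vary per query step. The names `greedy_top_k`, `jaccard`, and `lam`, as well as the toy results and relevance scores, are hypothetical and chosen for illustration.

```python
from typing import List, Sequence


def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity between two result strings (illustrative)."""
    ta, tb = set(a.split()), set(b.split())
    return len(ta & tb) / len(ta | tb) if (ta | tb) else 0.0


def greedy_top_k(items: Sequence[str], relevance: Sequence[float],
                 k: int, lam: float) -> List[int]:
    """Greedily pick k item indices.

    lam = 1.0 ranks purely by relevance; smaller lam penalizes items
    that are similar to those already selected (more diversity).
    """
    selected: List[int] = []
    candidates = list(range(len(items)))
    while candidates and len(selected) < k:
        def score(i: int) -> float:
            # Penalty: similarity to the closest already-selected item.
            max_sim = max((jaccard(items[i], items[j]) for j in selected),
                          default=0.0)
            return lam * relevance[i] - (1.0 - lam) * max_sim
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Hypothetical results for the ambiguous query "jaguar".
results = ["jaguar car", "jaguar car price", "jaguar animal"]
scores = [0.9, 0.85, 0.6]

relevance_only = greedy_top_k(results, scores, k=2, lam=1.0)
balanced = greedy_top_k(results, scores, k=2, lam=0.5)
print(relevance_only, balanced)
```

With `lam=1.0` the two car-related results fill the screen, whereas `lam=0.5` replaces the second one with the animal interpretation. An adaptive method, as described in the abstract, would choose this balance at each query step rather than fixing it in advance.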
