Online Exploration for Detecting Shifts in Fresh Intent

In web search, recency ranking refers to the task of ranking documents while taking freshness into account as one of the criteria of their relevance. There are two approaches to recency ranking. One extends existing learning-to-rank algorithms to optimize for both freshness and relevance. The other relies on an aggregated search strategy: a dedicated fresh vertical is used, and fresh results from this vertical are subsequently integrated into the search engine result page. In this paper, we adopt the second strategy. In particular, we focus on the fresh vertical prediction task for repeating queries and identify the following novel algorithmic problem: how to quickly correct mistakes made by a state-of-the-art fresh intent detector that has erroneously detected or missed an upward fresh intent shift for a particular repeating query (i.e., a change in the degree to which the query has a fresh intent). We propose a method for solving this problem. We use online exploration at the very start of what we believe to be a detected intent shift. Based on this exploratory phase, we correct the mistakes made by a state-of-the-art fresh intent detector for queries whose fresh intent has shifted. Using query logs of Yandex, we demonstrate that our methods significantly improve the speed and quality of the detection of fresh intent shifts.
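
The abstract does not say which exploration algorithm is used, so the following is only a minimal sketch of the general idea, assuming a two-armed Beta-Bernoulli Thompson sampler per repeating query. The class name `FreshVerticalExplorer`, the arm names, the `prior_strength` parameter, and the use of clicks as the reward signal are all hypothetical choices made for illustration, not the authors' method. The detector's initial decision is encoded as a prior, and click feedback gathered during exploration can overturn it.

```python
import random


class FreshVerticalExplorer:
    """Hypothetical two-armed Thompson sampler for one repeating query.

    Arm "fresh" shows results from the fresh vertical; arm "plain" does not.
    Reward is assumed to be a binary click signal (an illustrative choice).
    """

    ARMS = ("fresh", "plain")

    def __init__(self, detector_says_fresh, prior_strength=2.0, seed=None):
        self.rng = random.Random(seed)
        self.alpha = {}
        self.beta = {}
        for arm in self.ARMS:
            # The arm the detector prefers gets extra prior pseudo-clicks.
            favored = (arm == "fresh") == detector_says_fresh
            self.alpha[arm] = 1.0 + (prior_strength if favored else 0.0)
            self.beta[arm] = 1.0

    def choose(self):
        """Sample a click rate for each arm and show the arm with the larger sample."""
        samples = {
            arm: self.rng.betavariate(self.alpha[arm], self.beta[arm])
            for arm in self.ARMS
        }
        return max(samples, key=samples.get)

    def update(self, arm, clicked):
        """Record one impression of `arm` and whether it was clicked."""
        if clicked:
            self.alpha[arm] += 1.0
        else:
            self.beta[arm] += 1.0

    def posterior_mean(self, arm):
        return self.alpha[arm] / (self.alpha[arm] + self.beta[arm])

    def corrected_decision(self):
        """True if, after exploration, the fresh arm looks better than the plain arm."""
        return self.posterior_mean("fresh") > self.posterior_mean("plain")
```

In use, one would call `choose()` for each impression of the query during the exploratory phase, feed the observed click back through `update()`, and read `corrected_decision()` once enough impressions have accumulated. How long the exploratory phase lasts and how the reward is defined are left open here.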
