An Optimization Framework for Weighting Implicit Relevance Labels for Personalized Web Search

Implicit feedback from users of a web search engine is an essential source of consistent personal relevance labels obtained from the actual population of users. However, previous studies on personalized search employ this source in a rather straightforward manner: documents that were clicked on receive the maximal gain, while all other documents are assigned zero gain. As we demonstrate in this paper, a ranking algorithm trained with these gains used directly as ground-truth relevance labels produces a suboptimal personalized ranking. We therefore develop a framework for automatically reweighting these labels based on more subtle aspects of user interaction with the result page. We propose an efficient methodology for deriving confidence levels for relevance labels that relies directly on the objective ranking measure. All our algorithms are evaluated on a large-scale query log provided by a major commercial search engine. The experimental results show that current state-of-the-art personalization approaches can be significantly improved by enriching relevance grades with weights extracted from post-impression user behavior.
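
To make the contrast concrete, the following is a minimal Python sketch of the two labeling schemes the abstract refers to: the standard binary click labels and a confidence-weighted variant. The dwell-time threshold and the specific weighting function here are illustrative assumptions only, not the paper's method; the paper derives label weights by optimizing the objective ranking measure directly.

```python
# Minimal sketch (not the paper's exact method): contrasts standard binary
# click labeling with a confidence-weighted variant in which each implicit
# label carries a weight derived from post-impression behavior.
# The dwell-time-based weighting below is a hypothetical illustration.

from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Impression:
    doc_id: str
    clicked: bool
    dwell_time_sec: float  # time spent on the result after the click


def binary_labels(impressions: List[Impression]) -> List[Tuple[str, float]]:
    """Baseline: clicked documents get gain 1, all others get gain 0."""
    return [(imp.doc_id, 1.0 if imp.clicked else 0.0) for imp in impressions]


def weighted_labels(impressions: List[Impression],
                    satisfied_dwell: float = 30.0) -> List[Tuple[str, float, float]]:
    """Hypothetical reweighting: each label is paired with a confidence
    weight in [0, 1] that grows with post-click dwell time, so short
    (likely unsatisfied) clicks contribute less to training."""
    labeled = []
    for imp in impressions:
        gain = 1.0 if imp.clicked else 0.0
        if imp.clicked:
            confidence = min(imp.dwell_time_sec / satisfied_dwell, 1.0)
        else:
            confidence = 1.0  # non-clicks kept as confident negatives
        labeled.append((imp.doc_id, gain, confidence))
    return labeled


if __name__ == "__main__":
    serp = [
        Impression("doc_a", clicked=True, dwell_time_sec=5.0),   # quick bounce
        Impression("doc_b", clicked=True, dwell_time_sec=90.0),  # satisfied click
        Impression("doc_c", clicked=False, dwell_time_sec=0.0),
    ]
    print(binary_labels(serp))
    print(weighted_labels(serp))
```

In this sketch the weights would be passed to a learning-to-rank algorithm as per-example instance weights; the paper's contribution is choosing those weights so that the trained ranker improves the target ranking metric rather than relying on a hand-picked dwell-time rule.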
