Unbiased Learning to Rank: Theory and Practice

Implicit feedback (e.g., user clicks) is an important source of data for modern search engines. While heavily biased [8, 9, 11, 27], it is cheap to collect and particularly useful for user-centric retrieval applications such as search ranking. To develop an unbiased learning-to-rank system with biased feedback, previous studies have focused on constructing probabilistic graphical models (e.g., click models) with user behavior hypothesis to extract and train ranking systems with unbiased relevance signals. Recently, a novel counterfactual learning framework that estimates and adopts examination propensity for unbiased learning to rank has attracted much attention. Despite its popularity, there is no systematic comparison of the unbiased learning-to-rank frameworks based on counterfactual learning and graphical models. In this tutorial, we aim to provide an overview of the fundamental mechanism for unbiased learning to rank. We will describe the theory behind existing frameworks, and give detailed instructions on how to conduct unbiased learning to rank in practice.

[1]  Yisong Yue,et al.  Beyond position bias: examining result attractiveness as a source of presentation bias in clickthrough data , 2010, WWW '10.

[2]  W. Bruce Croft,et al.  Neural Ranking Models with Weak Supervision , 2017, SIGIR.

[3]  Tie-Yan Liu,et al.  Learning to rank for information retrieval , 2009, SIGIR.

[4]  Thorsten Joachims,et al.  Batch learning from logged bandit feedback through counterfactual risk minimization , 2015, J. Mach. Learn. Res..

[5]  Aniket Kittur,et al.  Crowdsourcing user studies with Mechanical Turk , 2008, CHI.

[6]  ChengXiang Zhai,et al.  Content-aware click modeling , 2013, WWW '13.

[7]  Yiqun Liu,et al.  Training Deep Ranking Model with Weak Relevance Labels , 2017, ADC.

[8]  Mark T. Keane,et al.  Modeling Result-List Searching in the World Wide Web: The Role of Relevance Topologies and Trust Bias , 2006 .

[9]  Thorsten Joachims,et al.  Unbiased Learning-to-Rank with Biased Feedback , 2016, WSDM.

[10]  Jiafeng Guo,et al.  Analysis of the Paragraph Vector Model for Information Retrieval , 2016, ICTIR.

[11]  Thorsten Joachims,et al.  Interactively optimizing information retrieval systems as a dueling bandits problem , 2009, ICML '09.

[12]  Erick Cantú-Paz,et al.  Temporal click model for sponsored search , 2010, SIGIR.

[13]  Marc Najork,et al.  Learning to Rank with Selection Bias in Personal Search , 2016, SIGIR.

[14]  Thorsten Joachims,et al.  Learning Socially Optimal Information Systems from Egoistic Users , 2013, ECML/PKDD.

[15]  Olivier Chapelle,et al.  A dynamic bayesian network click model for web search ranking , 2009, WWW '09.

[16]  Yiqun Liu,et al.  Incorporating Non-sequential Behavior into Click Models , 2015, SIGIR.

[17]  Thorsten Joachims,et al.  Counterfactual Risk Minimization: Learning from Logged Bandit Feedback , 2015, ICML.

[18]  Nick Craswell,et al.  Learning to Match using Local and Distributed Representations of Text for Web Search , 2016, WWW.

[19]  Benjamin Piwowarski,et al.  A user browsing model to predict search engine click data from past observations. , 2008, SIGIR '08.

[20]  Thorsten Joachims,et al.  Accurately interpreting clickthrough data as implicit feedback , 2005, SIGIR '05.

[21]  Filip Radlinski,et al.  Evaluating the accuracy of implicit feedback from clicks and query reformulations in Web search , 2007, TOIS.

[22]  Filip Radlinski,et al.  Large-scale validation and analysis of interleaved search evaluation , 2012, TOIS.

[23]  W. Bruce Croft,et al.  A Deep Relevance Matching Model for Ad-hoc Retrieval , 2016, CIKM.

[24]  M. de Rijke,et al.  Multileave Gradient Descent for Fast Online Learning to Rank , 2016, WSDM.

[25]  Alon Y. Halevy,et al.  Crowdsourcing systems on the World-Wide Web , 2011, Commun. ACM.

[26]  Marc Najork,et al.  Position Bias Estimation for Unbiased Learning to Rank in Personal Search , 2018, WSDM.

[27]  D. Rubin,et al.  The central role of the propensity score in observational studies for causal effects , 1983 .