Position Bias Estimation for Unbiased Learning to Rank in Personal Search

A well-known challenge in learning from click data is its inherent bias, most notably position bias. Traditional click models aim to extract ⟨query, document⟩ relevance, and the estimated bias is usually discarded once relevance has been extracted. In contrast, recent work on unbiased learning-to-rank can effectively leverage the bias and thus focuses on estimating bias rather than relevance [20, 31]. Existing approaches use search result randomization over a small fraction of production traffic to estimate the position bias. This is undesirable because result randomization can degrade users' search experience. In this paper, we first compare different schemes for result randomization (i.e., RandTopN and RandPair) and show their negative effect in personal search. We then study how to infer such bias from regular click data without relying on randomization. We propose a regression-based Expectation-Maximization (EM) algorithm that is based on a position bias click model and that can handle highly sparse clicks in personal search. We evaluate our EM algorithm and the extracted bias in the learning-to-rank setting. Our results show that it is promising to extract position bias from regular clicks without result randomization, and that the extracted bias can significantly improve learning-to-rank algorithms. In addition, we compare pointwise and pairwise learning-to-rank models and find that pairwise models are more effective at leveraging the estimated bias.
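
To make the underlying click model concrete: under the position bias model (PBM), the click probability factorizes as P(click | q, d, k) = θ_k · γ_{q,d}, where θ_k is the probability that a user examines position k and γ_{q,d} is the relevance of the ⟨query, document⟩ pair. Below is a minimal sketch of the standard EM procedure for this model. It is not the paper's regression-based variant (which replaces the per-pair relevance table with a regression function over features to cope with click sparsity), and all names (`em_position_bias`, `logs`) are illustrative.

```python
import numpy as np
from collections import defaultdict

def em_position_bias(logs, num_positions, iters=50):
    """EM for the position bias model:
    P(click | q, d, k) = theta[k] * gamma[(q, d)].

    logs: list of (query, doc, position, clicked) tuples, 0-indexed positions.
    Returns per-position examination probabilities theta and a relevance
    table gamma over (query, doc) pairs.
    """
    theta = np.full(num_positions, 0.5)   # examination probability per position
    gamma = defaultdict(lambda: 0.5)      # relevance per (query, doc) pair

    for _ in range(iters):
        # Accumulators for the M-step: expected examination / relevance counts.
        exam_num = np.zeros(num_positions)
        exam_den = np.zeros(num_positions)
        rel_num, rel_den = defaultdict(float), defaultdict(float)

        for q, d, k, c in logs:
            t, g = theta[k], gamma[(q, d)]
            if c:
                # A click implies the result was both examined and relevant.
                p_exam = p_rel = 1.0
            else:
                # E-step: posteriors given no click -- either examined but
                # irrelevant, or relevant but not examined.
                denom = 1.0 - t * g
                p_exam = t * (1.0 - g) / denom
                p_rel = (1.0 - t) * g / denom
            exam_num[k] += p_exam
            exam_den[k] += 1.0
            rel_num[(q, d)] += p_rel
            rel_den[(q, d)] += 1.0

        # M-step: re-estimate parameters from the expected counts.
        seen = exam_den > 0
        theta[seen] = exam_num[seen] / exam_den[seen]
        for key in rel_num:
            gamma[key] = rel_num[key] / rel_den[key]

    return theta, dict(gamma)
```

Once estimated, θ serves as the propensity of observation: in the spirit of the unbiased learning-to-rank work cited above, a click recorded at position k can be weighted by 1/θ_k in the ranking loss so that the learner corrects for position bias rather than absorbing it.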

[1] Nick Craswell et al. An experimental comparison of click position-bias models, 2008, WSDM '08.

[2] Marc Najork et al. Learning to Rank with Selection Bias in Personal Search, 2016, SIGIR.

[3] Qiang Yang et al. A Whole Page Click Model to Better Interpret Search Engine Click Data, 2011, AAAI.

[4] Martin Hacker et al. Understanding re-finding behavior in naturalistic email interaction logs, 2011, SIGIR '11.

[5] Zheng Chen et al. A novel click model and its applications to online advertising, 2010, WSDM '10.

[6] Christopher J. C. Burges. From RankNet to LambdaRank to LambdaMART: An Overview, 2010.

[7] Ya Xu et al. Computers and iphones and mobile phones, oh my!: a logs-based comparison of search users on different devices, 2009, WWW '09.

[8] Lihong Li et al. Counterfactual Estimation and Optimization of Click Metrics in Search Engines: A Case Study, 2015, WWW.

[9] Olivier Chapelle et al. A dynamic bayesian network click model for web search ranking, 2009, WWW '09.

[10] Marc Najork et al. Learning from User Interactions in Personal Search via Attribute Parameterization, 2017, WSDM.

[11] D. Rubin et al. The central role of the propensity score in observational studies for causal effects, 1983, Biometrika.

[12] J. Friedman. Greedy function approximation: A gradient boosting machine, 2001, Annals of Statistics.

[13] Yisong Yue et al. Beyond position bias: examining result attractiveness as a source of presentation bias in clickthrough data, 2010, WWW '10.

[14] Susan T. Dumais et al. Characterizing Email Search using Large-scale Behavioral Logs and Surveys, 2017, WWW.

[15] David Carmel et al. Promoting Relevant Results in Time-Ranked Mail Search, 2017, WWW.

[16] Trevor Hastie et al. The Elements of Statistical Learning, 2001, Springer.

[17] Chao Liu et al. Click chain model in web search, 2009, WWW '09.

[18] Matthew Richardson et al. Predicting clicks: estimating the click-through rate for new ads, 2007, WWW '07.

[19] Thorsten Joachims et al. Recommendations as Treatments: Debiasing Learning and Evaluation, 2016, ICML.

[20] M. de Rijke et al. An Introduction to Click Models for Web Search: SIGIR 2015 Tutorial, 2015, SIGIR.

[21] A. P. Dempster et al. Maximum likelihood from incomplete data via the EM algorithm (with discussion), 1977, Journal of the Royal Statistical Society, Series B.

[23] Susan T. Dumais et al. Stuff I've Seen: A System for Personal Information Retrieval and Re-Use, 2003, SIGIR '03.

[24] Tie-Yan Liu. Learning to rank for information retrieval, 2009, Foundations and Trends in Information Retrieval.

[25] Mark T. Keane et al. Modeling Result-List Searching in the World Wide Web: The Role of Relevance Topologies and Trust Bias, 2006.

[26] Hamed Zamani et al. Situational Context for Ranking in Personal Search, 2017, WWW.

[27] John Langford et al. Doubly Robust Policy Evaluation and Learning, 2011, ICML.

[28] Thorsten Joachims et al. Effective Evaluation Using Logged Bandit Feedback from Multiple Loggers, 2017, KDD.

[29] Thorsten Joachims et al. Batch learning from logged bandit feedback through counterfactual risk minimization, 2015, J. Mach. Learn. Res.

[30] Benjamin Piwowarski et al. A user browsing model to predict search engine click data from past observations, 2008, SIGIR '08.

[31] Thorsten Joachims et al. Accurately interpreting clickthrough data as implicit feedback, 2005, SIGIR '05.

[32] Filip Radlinski et al. Understanding and Modeling Success in Email Search, 2017, SIGIR.

[33] Wei Chu et al. Unbiased offline evaluation of contextual-bandit-based news article recommendation algorithms, 2011, WSDM '11.

[34] M. de Rijke et al. Click Models for Web Search, 2015, Morgan & Claypool.

[35] David Carmel et al. Rank by Time or by Relevance?: Revisiting Email Search, 2015, CIKM.