Addressing Trust Bias for Unbiased Learning-to-Rank

Existing unbiased learning-to-rank models use counterfactual inference, notably Inverse Propensity Scoring (IPS), to learn a ranking function from biased click data. They handle the click incompleteness bias, but usually assume that the clicks are noise-free, i.e., a clicked document is always assumed to be relevant. In this paper, we relax this unrealistic assumption and study click noise explicitly in the unbiased learning-to-rank setting. Specifically, we model the noise as the position-dependent trust bias and propose a noise-aware Position-Based Model, named TrustPBM, to better capture user click behavior. We propose an Expectation-Maximization algorithm to estimate both examination and trust bias from click data in TrustPBM. Furthermore, we show that it is difficult to use a pure IPS method to incorporate click noise and thus propose a novel method that combines a Bayes rule application with IPS for unbiased learning-to-rank. We evaluate our proposed methods on three personal search data sets and demonstrate that our proposed model can significantly outperform the existing unbiased learning-to-rank methods.
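The abstract gives no formulas, so the following is only a minimal sketch of how a position-based click model with position-dependent click noise could be written down, and of the Bayes-rule step that turns a noisy click into a relevance estimate. All names here (`theta_k` for the examination probability at position k, `eps_plus_k` and `eps_minus_k` for the probability of a click on an examined relevant or non-relevant document, `gamma` for the relevance probability) are assumptions made for illustration and are not taken from the text above.

```python
def click_probability(theta_k, eps_plus_k, eps_minus_k, gamma):
    """Click probability at position k under a noise-aware position-based model.

    Assumed factorization (illustrative, not quoted from the paper):
    P(click) = P(examined) * [P(click | examined, relevant) * P(relevant)
                              + P(click | examined, not relevant) * P(not relevant)]
    """
    return theta_k * (eps_plus_k * gamma + eps_minus_k * (1.0 - gamma))


def relevance_given_click(eps_plus_k, eps_minus_k, gamma):
    """Bayes rule: probability that a clicked, examined document is relevant."""
    numerator = eps_plus_k * gamma
    denominator = numerator + eps_minus_k * (1.0 - gamma)
    # Guard against a degenerate position where no click is possible.
    return numerator / denominator if denominator > 0.0 else 0.0


def ips_weight(theta_k):
    """Plain inverse propensity weight, which corrects examination bias only."""
    return 1.0 / theta_k


# Noise-free special case: every click on an examined document implies relevance,
# so the model reduces to the standard position-based model.
noise_free = relevance_given_click(eps_plus_k=1.0, eps_minus_k=0.0, gamma=0.4)

# Noisy case: a click is weaker evidence of relevance.
noisy = relevance_given_click(eps_plus_k=0.9, eps_minus_k=0.3, gamma=0.5)
```

In this sketch, a click would be weighted by `relevance_given_click(...) * ips_weight(theta_k)` rather than by `ips_weight(theta_k)` alone, which is one plausible reading of "combines a Bayes rule application with IPS"; the exact estimator used by the authors is not stated in the abstract.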

Michael Bendersky | Marc Najork | Aman Agarwal | Xuanhui Wang | Cheng Li
