Estimating Position Bias without Intrusive Interventions

Presentation bias is one of the key challenges when learning from implicit feedback in search engines, as it confounds the relevance signal. While it was recently shown that counterfactual learning-to-rank (LTR) approaches (Joachims et al., 2017) can provably overcome presentation bias when observation propensities are known, it remains to be shown how to estimate these propensities effectively. In this paper, we propose the first method for producing consistent propensity estimates without manual relevance judgments, disruptive interventions, or restrictive relevance modeling assumptions. First, we show how to harvest a specific type of intervention data from historic feedback logs of multiple different ranking functions, and show that this data is sufficient for consistent propensity estimation in the position-based model. Second, we propose a new extremum estimator that makes effective use of this data. In an empirical evaluation, we find that the new estimator provides superior propensity estimates in two real-world systems: arXiv Full-text Search and Google Drive Search. Beyond these two systems, simulation studies show that the method is robust to a wide range of settings.
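To make the harvesting idea concrete, the following is a minimal sketch and not the extremum estimator proposed in the paper. It assumes the position-based model, in which the click probability of a document factors into a rank-dependent examination propensity times a relevance term, and it assumes that queries are assigned to ranking functions independently of relevance. Under those assumptions, a query-document pair that different rankers placed at two different ranks acts as a natural intervention, and the ratio of click rates at the two ranks estimates the ratio of propensities. The log format, the function name `relative_propensity`, and the toy numbers are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def relative_propensity(logs, k, k_prime):
    """Estimate p_k / p_k' from pairs that were shown at both ranks.

    logs: iterable of (query, doc, rank, impressions, clicks) tuples,
          aggregated per ranker (a hypothetical format for this sketch).
    """
    # Aggregate impressions and clicks per (query, doc) pair and rank.
    stats = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for query, doc, rank, impressions, clicks in logs:
        cell = stats[(query, doc)][rank]
        cell[0] += impressions
        cell[1] += clicks

    # Use only pairs observed at both ranks. For those pairs the
    # relevance term is the same at either rank, so it cancels in the ratio.
    rate_k = 0.0
    rate_k_prime = 0.0
    for by_rank in stats.values():
        if k in by_rank and k_prime in by_rank:
            n_k, c_k = by_rank[k]
            n_kp, c_kp = by_rank[k_prime]
            if n_k > 0 and n_kp > 0:
                rate_k += c_k / n_k
                rate_k_prime += c_kp / n_kp

    if rate_k_prime == 0:
        raise ValueError("no clicks at rank k_prime on shared pairs")
    return rate_k / rate_k_prime


# Toy logs built from assumed propensities p_1 = 1.0 and p_2 = 0.5.
logs = [
    ("q1", "d1", 1, 100, 80),  # ranker A, relevance 0.8
    ("q1", "d1", 2, 200, 80),  # ranker B moved d1 to rank 2
    ("q2", "d2", 2, 100, 20),  # ranker A, relevance 0.4
    ("q2", "d2", 1, 50, 20),   # ranker B moved d2 to rank 1
    ("q3", "d3", 1, 100, 60),  # both rankers agree, so this pair
    ("q3", "d3", 1, 100, 60),  # carries no intervention signal
]

ratio = relative_propensity(logs, 1, 2)
```

On these toy logs the ratio comes out at the assumed value of p_1 / p_2 = 2. The pair on which both rankers agree is ignored, which reflects why logs from several different ranking functions are needed: only disagreements between rankers provide usable interventions.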

[1] Mark T. Keane, et al. Modeling Result-List Searching in the World Wide Web: The Role of Relevance Topologies and Trust Bias, 2006.

[2] M. de Rijke, et al. Click Models for Web Search, 2015.

[3] Filip Radlinski, et al. Evaluating the accuracy of implicit feedback from clicks and query reformulations in Web search, 2007, TOIS.

[4] Chao Liu, et al. Click chain model in web search, 2009, WWW '09.

[5] Marc Najork, et al. Position Bias Estimation for Unbiased Learning to Rank in Personal Search, 2018, WSDM.

[6] Lihong Li, et al. Counterfactual Estimation and Optimization of Click Metrics for Search Engines, 2014, ArXiv.

[7] D. Rubin, et al. Causal Inference for Statistics, Social, and Biomedical Sciences: An Introduction, 2016.

[8] Benjamin Piwowarski, et al. A user browsing model to predict search engine click data from past observations, 2008, SIGIR '08.

[9] Yisong Yue, et al. Beyond position bias: examining result attractiveness as a source of presentation bias in clickthrough data, 2010, WWW '10.

[10] Thorsten Joachims, et al. Accurately Interpreting Clickthrough Data as Implicit Feedback, 2017.

[11] Thorsten Joachims, et al. Optimizing search engines using clickthrough data, 2002, KDD.

[12] Nick Craswell, et al. An experimental comparison of click position-bias models, 2008, WSDM '08.

[13] Thorsten Joachims, et al. Effective Evaluation Using Logged Bandit Feedback from Multiple Loggers, 2017, KDD.

[14] Thorsten Joachims, et al. Batch learning from logged bandit feedback through counterfactual risk minimization, 2015, J. Mach. Learn. Res.

[15] Olivier Chapelle, et al. A dynamic bayesian network click model for web search ranking, 2009, WWW '09.

[16] W. Bruce Croft, et al. Unbiased Learning to Rank with Unbiased Propensity Estimation, 2018, SIGIR.

[17] D. Rubin, et al. The central role of the propensity score in observational studies for causal effects, 1983.

[18] Filip Radlinski, et al. Large-scale validation and analysis of interleaved search evaluation, 2012, TOIS.

[19] Wei Chu, et al. Unbiased offline evaluation of contextual-bandit-based news article recommendation algorithms, 2010, WSDM '11.

[20] M. de Rijke, et al. Deep Learning with Logged Bandit Feedback, 2018, ICLR.

[21] John Langford, et al. Doubly Robust Policy Evaluation and Learning, 2011, ICML.

[22] Thorsten Joachims, et al. The Self-Normalized Estimator for Counterfactual Learning, 2015, NIPS.

[23] Marc Najork, et al. Learning to Rank with Selection Bias in Personal Search, 2016, SIGIR.

[24] Ben Carterette, et al. Offline Comparative Evaluation with Incremental, Minimally-Invasive Online Feedback, 2018, SIGIR.

[25] Matthew Richardson, et al. Predicting clicks: estimating the click-through rate for new ads, 2007, WWW '07.

[26] John Langford, et al. Exploration scavenging, 2008, ICML '08.

[27] Thorsten Joachims, et al. Counterfactual Learning-to-Rank for Additive Metrics and Deep Models, 2018, ArXiv.

[28] Thorsten Joachims, et al. Unbiased Learning-to-Rank with Biased Feedback, 2016, WSDM.

[29] C. Blumberg. Causal Inference for Statistics, Social, and Biomedical Sciences: An Introduction, 2016.

[30] B. Shepherd, et al. GUIDO IMBENS, DONALD RUBIN, Causal Inference for Statistics, Social, and Biomedical Sciences: An Introduction. New York: Cambridge University Press, 2016, Biometrics.

[31] Thorsten Joachims, et al. Recommendations as Treatments: Debiasing Learning and Evaluation, 2016, ICML.

[32] M. de Rijke, et al. An Introduction to Click Models for Web Search: SIGIR 2015 Tutorial, 2015, SIGIR.

[33] D. Rubin, et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper, 1977.