Correcting for Selection Bias in Learning-to-rank Systems

Click data collected by modern recommendation systems are an important source of observational data that can be used to train learning-to-rank (LTR) systems. However, these data suffer from a number of biases that can degrade the performance of LTR systems. Recent methods for bias correction in such systems mostly focus on position bias: higher-ranked results (e.g., top search engine results) are more likely to be clicked even if they are not the most relevant results for a user’s query. Less attention has been paid to correcting for selection bias, which occurs because clicks can only be observed on documents that were shown to the user in the first place. Here, we propose new counterfactual approaches that adapt Heckman’s two-stage method and account for both selection and position bias in LTR systems. Our empirical evaluation shows that the proposed methods are much more robust to noise and more accurate than existing unbiased LTR algorithms, especially when there is moderate to no position bias.
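For readers unfamiliar with the classical Heckman two-stage correction that the proposed approaches adapt, the following is a minimal sketch on synthetic regression data. It is not the ranking adaptation described in the abstract: the data-generating process, the variable names (`x`, `z`, `rho`), and the helper functions (`fit_probit`, `heckman_two_stage`, `simulate`) are all assumptions made for illustration. Stage one fits a probit model of whether an outcome is observed; stage two adds the resulting inverse Mills ratio as an extra regressor when fitting the outcome model on the observed sample only.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def norm_pdf(z):
    return np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * (1.0 + _erf(z / math.sqrt(2.0)))

def fit_probit(X, s, n_iter=25):
    """Stage 1: fit P(s = 1 | X) = Phi(X @ gamma) by Fisher scoring."""
    gamma = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ gamma
        p = np.clip(norm_cdf(eta), 1e-10, 1.0 - 1e-10)
        d = norm_pdf(eta)
        var = p * (1.0 - p)
        score = X.T @ (d * (s - p) / var)            # gradient of log-likelihood
        info = X.T @ (X * (d ** 2 / var)[:, None])   # expected information
        gamma = gamma + np.linalg.solve(info, score)
    return gamma

def heckman_two_stage(X_sel, s, X_out, y):
    """Return outcome coefficients; the last entry multiplies the Mills ratio."""
    gamma = fit_probit(X_sel, s)
    eta = X_sel @ gamma
    mills = norm_pdf(eta) / np.clip(norm_cdf(eta), 1e-10, None)
    obs = s == 1
    # Stage 2: least squares on the selected sample with the Mills ratio added.
    design = np.column_stack([X_out[obs], mills[obs]])
    coef, *_ = np.linalg.lstsq(design, y[obs], rcond=None)
    return coef

def simulate(n=20000, rho=0.9, seed=0):
    """Hypothetical data: y = 1 + 2x + e, observed only when x + z + u > 0."""
    rng = np.random.default_rng(seed)
    x, z, u = rng.normal(size=(3, n))
    e = rho * u + math.sqrt(1.0 - rho ** 2) * rng.normal(size=n)  # corr(u, e) = rho
    s = (x + z + u > 0).astype(float)
    y = 1.0 + 2.0 * x + e
    ones = np.ones(n)
    return np.column_stack([ones, x, z]), s, np.column_stack([ones, x]), y

X_sel, s, X_out, y = simulate()
obs = s == 1
naive, *_ = np.linalg.lstsq(X_out[obs], y[obs], rcond=None)
corrected = heckman_two_stage(X_sel, s, X_out, y)
print("true slope: 2.0")
print("naive slope on selected sample:", naive[1])
print("Heckman-corrected slope:", corrected[1])
```

In this simulation the selection noise `u` and the outcome noise `e` are correlated, so fitting only on the selected rows biases the slope, while the second-stage regression with the Mills ratio should recover a value closer to the true slope. The variable `z` affects selection but not the outcome, which is the usual exclusion restriction that keeps the second stage well identified.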
