On caption bias in interleaving experiments

Information retrieval evaluation most often involves manually assessing the relevance of particular query-document pairs. In cases where this is difficult (such as personalized search), interleaved comparison methods are becoming increasingly common. These methods compare pairs of ranking functions based on user clicks on search results, and thus better reflect true user preferences. However, because they depend on clicks, these methods are susceptible to bias: for example, users have previously been shown to be more likely to click on results with attractive titles and snippets. An interleaving evaluation in which one ranker tends to generate results that attract more clicks (without being more relevant) may thus be biased. We present an approach for detecting and compensating for this type of bias in interleaving evaluations. Introducing a new model of caption bias, we propose features that capture bias based on (1) per-document effects and (2) the (pairwise) relationships between a document and the documents surrounding it. We show that our model can effectively capture click behavior, with the best results achieved by a model that combines both per-document and pairwise features. Applying this model to re-weight observed user clicks, we find a small overall effect on real interleaving comparisons, but we also identify a case where an initially detected preference vanishes once caption-bias re-weighting is applied. Our results indicate that our model of caption bias is effective and can successfully identify interleaving experiments affected by caption bias.
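
To make the re-weighting idea concrete, the following is a minimal sketch (in Python) of how bias-corrected clicks could feed into an interleaving comparison. The function names, feature set, and weight values are illustrative assumptions, not the paper's actual model or learned parameters; they only show the mechanism of down-weighting clicks on results whose captions are predicted to attract extra clicks.

```python
# Hedged sketch: re-weighting clicks in an interleaved comparison using a
# hypothetical caption-bias model. All feature names, weights, and document
# assignments below are illustrative assumptions, not values from the paper.

def caption_bias_weight(doc_features, model_weights):
    """Return a multiplicative correction for a click on a result.

    The caption-attractiveness score is a simple linear combination of
    per-document features here (the paper also uses pairwise features).
    A score above 1 means the caption is predicted to attract extra clicks,
    so the click is down-weighted by 1 / score.
    """
    score = sum(model_weights.get(f, 0.0) * v for f, v in doc_features.items())
    attractiveness = max(score, 1e-6)  # keep the correction positive
    return 1.0 / attractiveness

def interleaving_credit(clicks, assignments, features, model_weights=None):
    """Credit each clicked result to the ranker (A or B) that contributed it.

    Without model_weights every click counts as 1; with model_weights each
    click counts with its bias-corrected weight.
    """
    credit = {"A": 0.0, "B": 0.0}
    for doc in clicks:
        w = 1.0
        if model_weights is not None:
            w = caption_bias_weight(features[doc], model_weights)
        credit[assignments[doc]] += w
    return credit

# Toy impression: ranker A's result d1 has a bold, keyword-heavy caption that
# (under the hypothetical model) inflates its click probability.
assignments = {"d1": "A", "d2": "B", "d3": "B"}
features = {
    "d1": {"title_bold_terms": 2.0, "snippet_len": 1.0},
    "d2": {"title_bold_terms": 1.0, "snippet_len": 1.0},
    "d3": {"title_bold_terms": 1.0, "snippet_len": 0.5},
}
weights = {"title_bold_terms": 0.6, "snippet_len": 0.4}  # assumed values
clicks = ["d1", "d2"]

print(interleaving_credit(clicks, assignments, features))           # raw clicks: tie
print(interleaving_credit(clicks, assignments, features, weights))  # re-weighted: B ahead
```

In this toy example the raw click counts produce a tie, while the re-weighted counts favor ranker B, illustrating how a detected preference can shift (or vanish) once clicks attributable to attractive captions are discounted.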
