"Learning to rank for information retrieval from user interactions" by K. Hofmann, S. Whiteson, A. Schuth, and M. de Rijke with Martin Vesely as coordinator

In this article we give an overview of our recent work on online learning to rank for information retrieval (IR). This work approaches IR from a reinforcement learning (RL) perspective, with the aim of enabling systems that learn directly from interactions with their users. Learning directly from user interactions is difficult for several reasons. First, user interactions are hard to interpret as feedback for learning because they are typically biased and noisy. Second, the system can only observe feedback on actions (e.g., rankers, documents) actually shown to users, which creates an exploration-exploitation challenge. Third, the amount of feedback, and therefore the quality of learning, is limited by the number of user interactions, so it is important to use the observed data as effectively as possible. Here, we discuss our work on interpreting user feedback using probabilistic interleaved comparisons, and on learning to rank from noisy, relative feedback.
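To illustrate how an interleaved comparison turns clicks into relative feedback between two rankers, the sketch below implements a simplified team-draft-style interleaving. This is an illustrative simplification, not the probabilistic method of Hofmann et al.: the two rankings are merged into one result list, each slot remembers which ranker contributed it, and clicks are credited to the contributing ranker. All function and variable names here are our own.

```python
import random


def team_draft_interleave(ranking_a, ranking_b, rng=random.Random(0)):
    """Merge two rankings into one list, remembering which ranker
    contributed each slot (simplified team-draft interleaving)."""
    interleaved, team = [], []
    count_a = count_b = 0
    while ranking_a or ranking_b:
        # The ranker that has contributed fewer documents so far places
        # its next document; ties are broken randomly.
        pick_a = count_a < count_b or (count_a == count_b and rng.random() < 0.5)
        source = ranking_a if pick_a else ranking_b
        if not source:  # that ranker has no documents left; use the other
            pick_a = not pick_a
            source = ranking_a if pick_a else ranking_b
        doc = source.pop(0)
        # Remove the chosen document from the other ranking as well,
        # so each document appears at most once in the result list.
        for other in (ranking_a, ranking_b):
            if doc in other:
                other.remove(doc)
        interleaved.append(doc)
        team.append("A" if pick_a else "B")
        if pick_a:
            count_a += 1
        else:
            count_b += 1
    return interleaved, team


def compare(team, clicked_ranks):
    """Credit each clicked slot to the ranker that contributed it.
    Returns > 0 if ranker A is preferred, < 0 if ranker B is."""
    wins_a = sum(1 for r in clicked_ranks if team[r] == "A")
    wins_b = sum(1 for r in clicked_ranks if team[r] == "B")
    return wins_a - wins_b
```

For example, interleaving `["d1", "d2", "d3"]` with `["d2", "d3", "d4"]` yields a single four-document list, and a user's clicks on that list produce a noisy, relative preference signal between the two rankers; aggregating such outcomes over many queries is what allows an online learner to decide which ranker to keep.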
