Generalized Team Draft Interleaving

Interleaving is an online evaluation method that compares two ranking functions by mixing their results and interpreting the users' click feedback. An important property of an interleaving method is its sensitivity, i.e., its ability to obtain reliable comparison outcomes from few user interactions. Several methods have been proposed to improve interleaving sensitivity; they can be roughly divided into two groups: (a) methods that optimize the credit assignment function (how the click feedback is interpreted), and (b) methods that achieve higher sensitivity by controlling the interleaving policy (how often a particular interleaved result page is shown). In this paper, we propose an interleaving framework that generalizes previously studied interleaving methods in two respects. First, it achieves higher sensitivity by performing a joint, data-driven optimization of the credit assignment function and the interleaving policy. Second, the framework is formulated to be general with respect to the search domain in which the interleaving experiment is deployed, so that it can be applied in domains with grid-based result presentation, such as image search. To simplify the optimization, we additionally introduce a stratified estimate of the experiment outcome. This stratification is also useful on its own, as it reduces the variance of the outcome and thus increases the interleaving sensitivity. We perform an extensive experimental study using large-scale document and image search datasets obtained from a commercial search engine. The experiments show that the proposed framework achieves marked improvements in sensitivity over effective baselines on both datasets.
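For readers unfamiliar with the baseline being generalized, the following minimal Python sketch illustrates classic Team Draft interleaving together with a simple rule-based credit assignment (one unit of credit per click, given to the team that contributed the clicked result). It is an illustrative assumption-laden sketch, not the paper's jointly optimized credit function or interleaving policy; the function names, the binary credit rule, and the example document ids are all hypothetical.

    import random


    def team_draft_interleave(ranking_a, ranking_b, length=10):
        """Team Draft: rankers A and B alternately 'pick' their highest-ranked
        result not yet placed; a coin flip decides who picks first each round."""
        interleaved = []      # mixed result list shown to the user
        team = []             # team[i] records which ranker contributed interleaved[i]
        seen = set()
        ia = ib = 0           # cursors into ranking_a / ranking_b

        while len(interleaved) < length and (ia < len(ranking_a) or ib < len(ranking_b)):
            order = ['A', 'B'] if random.random() < 0.5 else ['B', 'A']
            for owner in order:
                if len(interleaved) >= length:
                    break
                ranking = ranking_a if owner == 'A' else ranking_b
                idx = ia if owner == 'A' else ib
                # Skip anything already placed by the other team.
                while idx < len(ranking) and ranking[idx] in seen:
                    idx += 1
                if idx < len(ranking):
                    interleaved.append(ranking[idx])
                    team.append(owner)
                    seen.add(ranking[idx])
                    idx += 1
                # Store the advanced cursor back.
                if owner == 'A':
                    ia = idx
                else:
                    ib = idx
        return interleaved, team


    def credit_outcome(team, clicked_positions):
        """Binary credit assignment: each click gives +1 to the team that
        contributed the clicked result. Positive favours A, negative favours B."""
        credit_a = sum(1 for p in clicked_positions if team[p] == 'A')
        credit_b = sum(1 for p in clicked_positions if team[p] == 'B')
        return credit_a - credit_b


    # Hypothetical usage for one query impression:
    a = ["d1", "d2", "d3", "d4"]
    b = ["d3", "d1", "d5", "d6"]
    mixed, team = team_draft_interleave(a, b, length=4)
    outcome = credit_outcome(team, clicked_positions=[0, 2])  # user clicked ranks 1 and 3

In the generalized framework described in the abstract, the fixed coin-flip policy and the unit per-click credits of this sketch are replaced by a policy and a credit function that are jointly learned from historical data, and the experiment outcome is estimated with the stratification mentioned above rather than by simply averaging per-impression scores.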
