Is top-k sufficient for ranking?

Recently, `top-k learning to rank' has attracted much attention in the information retrieval community. The motivation comes from the difficulty of obtaining a full-order ranking list for training when employing reliable pairwise preference judgments. Inspired by the observation that users mainly care about the top-ranked search results, top-k learning to rank proposes to use top-k ground truth for training, in which only the total order of the top k items is provided instead of a full-order ranking list. However, it is not clear whether the underlying assumption holds, i.e., whether top-k ground truth is sufficient for training. In this paper, we study this problem from both empirical and theoretical perspectives. Empirically, our experimental results on the benchmark datasets LETOR 4.0 show that the test performance of both pairwise and listwise ranking algorithms quickly increases to a stable value as k grows in the top-k ground truth. Theoretically, we prove that the losses of these typical ranking algorithms in the top-k setting are tighter upper bounds of (1 − NDCG@k) than those in the full-order setting. Therefore, our studies reveal that learning on top-k ground truth is indeed sufficient for ranking, which lays a foundation for the new learning-to-rank framework.
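For readers unfamiliar with the evaluation measure bounded above, the following is a minimal sketch of how NDCG@k is computed from graded relevance labels. The function names and the gain/discount convention (gain 2^rel − 1, log2 position discount) are the standard ones and are assumed here for illustration; they are not prescribed by this paper.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k positions."""
    return sum((2 ** rel - 1) / math.log2(i + 2)
               for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(ranked_relevances, k):
    """NDCG@k: DCG of the predicted ranking normalized by the ideal DCG@k."""
    ideal = dcg_at_k(sorted(ranked_relevances, reverse=True), k)
    return dcg_at_k(ranked_relevances, k) / ideal if ideal > 0 else 0.0

# Example: relevance grades of documents in the order a ranker returned them.
print(ndcg_at_k([3, 2, 0, 1, 2], k=3))  # ~0.856
```

Because NDCG@k depends only on the first k positions of the ideal ordering, a top-k ground truth (the total order of the best k items) already determines the target value of this measure, which is the intuition behind bounding (1 − NDCG@k) in the top-k setting.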
