Learning to Rank by Optimizing NDCG Measure

Learning to rank is a relatively new field of study that aims to learn a ranking function from a set of training data with relevance labels. Ranking algorithms are typically evaluated with information retrieval measures such as Normalized Discounted Cumulative Gain (NDCG) [1] and Mean Average Precision (MAP) [2]. Until recently, however, most learning to rank algorithms did not optimize a loss function related to these evaluation measures. The main difficulty in optimizing them directly is that they depend on the ranks of documents rather than on the numerical scores output by the ranking function. We propose a probabilistic framework that addresses this challenge by optimizing the expectation of NDCG over all possible permutations of documents. A relaxation strategy is used to approximate the expectation of NDCG over the space of permutations, and a bound optimization approach is proposed to make the computation efficient. Extensive experiments show that the proposed algorithm outperforms state-of-the-art ranking algorithms on several benchmark data sets.
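To make the quantities in the abstract concrete, below is a minimal Python sketch (not the authors' implementation): it computes NDCG for the hard ranking induced by a score vector, and a Monte Carlo estimate of expected NDCG when permutations are drawn from a score-dependent distribution. The gain/discount convention (2^rel - 1 and 1/log2(position + 1)) and the Plackett-Luce sampling model are illustrative assumptions; the paper defines its own permutation probabilities and optimizes the expectation analytically via a relaxation rather than by sampling.

    import math
    import random

    def dcg(rels):
        """DCG of relevance labels listed in rank order (top first)."""
        return sum((2 ** r - 1) / math.log2(i + 2) for i, r in enumerate(rels))

    def ndcg(scores, labels):
        """NDCG of the hard ranking obtained by sorting documents by score."""
        ranked = [l for _, l in sorted(zip(scores, labels), key=lambda t: -t[0])]
        return dcg(ranked) / dcg(sorted(labels, reverse=True))

    def sample_permutation(scores):
        """Draw a ranking from a Plackett-Luce model with weights exp(score).
        (An illustrative choice, not the paper's permutation distribution.)"""
        remaining = list(range(len(scores)))
        order = []
        while remaining:
            weights = [math.exp(scores[i]) for i in remaining]
            pick = random.choices(remaining, weights=weights, k=1)[0]
            order.append(pick)
            remaining.remove(pick)
        return order

    def expected_ndcg(scores, labels, n_samples=5000):
        """Monte Carlo estimate of E[NDCG] over random permutations; the
        paper approximates this expectation with a relaxation instead."""
        ideal = dcg(sorted(labels, reverse=True))
        total = 0.0
        for _ in range(n_samples):
            order = sample_permutation(scores)
            total += dcg([labels[i] for i in order]) / ideal
        return total / n_samples

    labels = [3, 2, 0, 1]            # graded relevance of four documents
    scores = [1.2, 0.7, -0.3, 0.1]   # outputs of a ranking function
    print("NDCG of hard ranking :", round(ndcg(scores, labels), 4))
    print("Expected NDCG (MC)   :", round(expected_ndcg(scores, labels), 4))

The point of taking the expectation is smoothness: NDCG of the hard ranking is piecewise constant in the scores, whereas the expected NDCG under a score-dependent permutation distribution varies smoothly with the scores, which is what makes gradient-based optimization feasible.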

[1] Tao Qin, et al. Robust sparse rank learning for non-smooth ranking measures, 2009, SIGIR.

[2] Yoram Singer, et al. An efficient boosting algorithm for combining preferences, 2003, JMLR.

[3] Rong Jin, et al. Semi-Supervised Ensemble Ranking, 2008, AAAI.

[4] Ramesh Nallapati, et al. Discriminative models for information retrieval, 2004, SIGIR.

[5] Qiang Wu, et al. McRank: Learning to Rank Using Multiple Classification and Gradient Boosting, 2007, NIPS.

[6] Tao Qin, et al. Learning to Search Web Pages with Query-Level Loss Functions, 2006.

[7] Hang Li, et al. Ranking refinement and its application to information retrieval, 2008, WWW.

[8] Gregory N. Hullender, et al. Learning to rank using gradient descent, 2005, ICML.

[9] Tao Qin, et al. LETOR: Benchmark Dataset for Research on Learning to Rank for Information Retrieval, 2007.

[10] Thomas Hofmann, et al. Learning to Rank with Nonsmooth Cost Functions, 2006, NIPS.

[11] Tao Qin, et al. FRank: a ranking method with fidelity loss, 2007, SIGIR.

[12] Klaus Obermayer, et al. Support vector learning for ordinal regression, 1999.

[13] Zoubin Ghahramani, et al. On the Convergence of Bound Optimization Algorithms, 2002, UAI.

[14] Tie-Yan Liu, et al. Adapting ranking SVM to document retrieval, 2006, SIGIR.

[15] Tie-Yan Liu, et al. Listwise approach to learning to rank: theory and algorithm, 2008, ICML.

[16] Stephen E. Robertson, et al. SoftRank: optimizing non-smooth rank metrics, 2008, WSDM.

[17] Tie-Yan Liu, et al. Learning to rank: from pairwise approach to listwise approach, 2007, ICML.

[18] Filip Radlinski, et al. A support vector method for optimizing average precision, 2007, SIGIR.

[19] Hang Li, et al. AdaRank: a boosting algorithm for information retrieval, 2007, SIGIR.

[20] Wei-Pang Yang, et al. Learning to Rank for Information Retrieval Using Genetic Programming, 2007.

[22] Jaana Kekäläinen, et al. IR evaluation methods for retrieving highly relevant documents, 2000, SIGIR.

[23] Maksims Volkovs, et al. BoltzRank: learning to maximize expected ranking gain, 2009, ICML.