Ranking, Boosting, and Model Adaptation

We present a new ranking algorithm that combines the strengths of two previous methods: boosted tree classification, and LambdaRank, which has been shown to be empirically optimal for a widely used information retrieval measure. The algorithm is based on boosted regression trees, although the ideas apply to any weak learners, and it is significantly faster in both the training and test phases than the state of the art, for comparable accuracy. We also show how to find the optimal linear combination of any two rankers, and we use this method to solve the line search problem exactly during boosting. In addition, we show that starting with a previously trained model and boosting on its residuals furnishes an effective technique for model adaptation, and we give results for a particularly pressing problem in Web search: training rankers for markets for which only small amounts of labeled data are available, given a ranker trained on much more data from a larger market.
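The model-adaptation idea above can be sketched in a few lines: keep the previously trained model's scores fixed, and fit each new weak learner to the residuals between the targets and the current combined scores. The sketch below is illustrative only, with hypothetical data and names; the paper uses boosted regression trees, while here one-dimensional regression stumps stand in as the weak learners for brevity.

```python
# Sketch: adapting a previously trained model by boosting on its residuals.
# All data and model names here are illustrative, not the paper's.

def fit_stump(xs, residuals):
    """Fit a regression stump (one threshold, two leaf means) to the residuals."""
    best = None
    for t in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lmean = sum(left) / len(left) if left else 0.0
        rmean = sum(right) / len(right) if right else 0.0
        err = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, t, lmean, rmean)
    _, t, lmean, rmean = best
    return lambda x: lmean if x <= t else rmean

# Toy data: x is a document feature, y the target relevance score.
xs = [0.1 * i for i in range(20)]
ys = [x * x for x in xs]

# Stand-in for the model trained on the larger market.
base_model = lambda x: 0.5 * x
scores = [base_model(x) for x in xs]   # start from the existing model's scores

stumps, lr = [], 0.5
for _ in range(50):
    # Each round fits a weak learner to the residuals of the current ensemble.
    residuals = [y - s for y, s in zip(ys, scores)]
    stump = fit_stump(xs, residuals)
    stumps.append(stump)
    scores = [s + lr * stump(x) for s, x in zip(scores, xs)]

mse_base = sum((y - base_model(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)
mse_adapted = sum((y - s) ** 2 for y, s in zip(ys, scores)) / len(xs)
```

Because the first round's residuals are taken against the base model's output, the adapted ensemble only has to learn the correction for the new (small) market rather than the full ranking function from scratch.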
