Rankboost+: an improvement to Rankboost

Rankboost is a well-known algorithm that iteratively creates and aggregates a collection of “weak rankers” to build an effective ranking procedure. Initial work on Rankboost proposed two variants. One variant, which we call Rb-d, is designed for the scenario where all weak rankers have the binary range $$\{0,1\}$$; it has good theoretical properties but does not perform well in practice. The other, which we call Rb-c, has good empirical behavior and is the recommended variant for this binary weak ranker scenario, but it lacks a theoretical grounding. In this paper, we rectify this situation by proposing an improved Rankboost algorithm for the binary weak ranker scenario, which we call Rankboost+. We prove that this approach is theoretically sound and show empirically that it outperforms both Rankboost variants in practice. Further, the theory behind Rankboost+ helps explain why Rb-d may not perform well in practice and why Rb-c is better behaved in the binary weak ranker scenario, as has been observed in prior work.
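To make the "iteratively creates and aggregates weak rankers" idea concrete, here is a minimal sketch of a classic Rankboost-style loop for weak rankers with range $$\{0,1\}$$. It is not the Rankboost+ algorithm proposed in the paper, and it is not claimed to match Rb-d or Rb-c exactly. It assumes the standard pairwise setup: a distribution over preference pairs, a weak ranker chosen each round to minimize the normalizer Z, and a weight alpha that minimizes Z in closed form for binary outputs. The names `rankboost`, `score`, `pairs`, and the smoothing constant `eps` are illustrative assumptions.

```python
import math

def rankboost(pairs, weak_rankers, rounds=10, eps=1e-9):
    """Sketch of a Rankboost-style loop for binary weak rankers.

    pairs: list of (lo, hi) items, meaning hi should be ranked above lo.
    weak_rankers: list of functions mapping an item to 0 or 1.
    Returns a list of (alpha, ranker) tuples.
    """
    n = len(pairs)
    dist = [1.0 / n] * n  # uniform distribution over preference pairs
    ensemble = []
    for _ in range(rounds):
        best = None
        for h in weak_rankers:
            # diff is +1 if h orders the pair correctly, -1 if wrongly, 0 if tied
            diffs = [h(hi) - h(lo) for lo, hi in pairs]
            w_correct = sum(d for d, df in zip(dist, diffs) if df == 1)
            w_wrong = sum(d for d, df in zip(dist, diffs) if df == -1)
            w_tied = sum(d for d, df in zip(dist, diffs) if df == 0)
            # closed-form minimizer of Z for binary outputs; eps is an
            # assumed smoothing term that avoids log(0) and division by zero
            alpha = 0.5 * math.log((w_correct + eps) / (w_wrong + eps))
            z = w_tied + w_correct * math.exp(-alpha) + w_wrong * math.exp(alpha)
            if best is None or z < best[0]:
                best = (z, alpha, h, diffs)
        z, alpha, h, diffs = best
        # shrink the weight of correctly ordered pairs, grow misordered ones
        dist = [d * math.exp(-alpha * df) / z for d, df in zip(dist, diffs)]
        ensemble.append((alpha, h))
    return ensemble

def score(ensemble, item):
    """Final ranking score: weighted sum of the selected weak rankers."""
    return sum(alpha * h(item) for alpha, h in ensemble)
```

A small usage example under the same assumptions: with items `0..4`, pairs `(lo, hi)` for every `lo < hi`, and threshold rankers such as `lambda x, t=t: int(x >= t)`, calling `rankboost(pairs, rankers)` and sorting items by `score` gives a combined ranking built from the selected thresholds.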
