Paper information - A large-scale study of the effect of training set characteristics over learning-to-rank algorithms

In this work we describe the results of a large-scale study of how the distribution of labels across the different grades of relevance in the training set affects the performance of trained ranking functions. In a controlled experiment we generate a large number of training datasets with different label distributions and apply three learning-to-rank algorithms to these datasets. We investigate the effect of these distributions on the accuracy of the resulting ranking functions to give insight into how training sets should be constructed.

Evangelos Kanoulas | Pavel Metrikov | Javed A. Aslam | Virgil Pavlu | Stefan Savev

[1] Emine Yilmaz, et al. Document selection methodologies for efficient and effective learning-to-rank, 2009, SIGIR.

[2] Thorsten Joachims, et al. Training linear SVMs in linear time, 2006, KDD '06.

[3] Yoram Singer, et al. An Efficient Boosting Algorithm for Combining Preferences, 2013.