LETOR: A benchmark collection for research on learning to rank for information retrieval

LETOR is a benchmark collection for research on learning to rank for information retrieval, released by Microsoft Research Asia. In this paper, we describe the details of the LETOR collection and show how it can be used in different kinds of research. Specifically, we describe how the document corpora and query sets in LETOR are selected, how the documents are sampled, how the learning features and meta information are extracted, and how the datasets are partitioned for comprehensive evaluation. We then compare several state-of-the-art learning to rank algorithms on LETOR, report their ranking performance, and discuss the results. After that, we discuss possible new research topics that LETOR can support, in addition to algorithm comparison. We hope that this paper helps readers gain a deeper understanding of LETOR and enables more interesting research projects on learning to rank and related topics.
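To make the abstract's description of features and evaluation more concrete, below is a minimal sketch, assuming the standard LETOR SVM-light-style feature format (lines of the form `<label> qid:<qid> 1:<v> 2:<v> ... # docid`), of loading one data file and scoring a ranked list with NDCG, one of the evaluation measures commonly reported on LETOR. The file name, the feature index used as the ranking score, and the cutoff are illustrative assumptions, not part of the collection itself.

```python
# Hypothetical sketch: parse LETOR-style data and compute mean NDCG@10
# for a simple single-feature ranker. Paths and feature indices are assumptions.
import math
from collections import defaultdict

def parse_letor_line(line):
    """Parse one line of the form '<label> qid:<qid> 1:<v> 2:<v> ... # comment'."""
    body = line.split('#', 1)[0].split()
    label = int(body[0])
    qid = body[1].split(':', 1)[1]
    features = {int(k): float(v) for k, v in (t.split(':', 1) for t in body[2:])}
    return qid, label, features

def dcg(labels, k=10):
    """Discounted cumulative gain with the (2^rel - 1) gain commonly used in LETOR evaluation."""
    return sum((2 ** rel - 1) / math.log2(i + 2) for i, rel in enumerate(labels[:k]))

def ndcg(labels, k=10):
    ideal = dcg(sorted(labels, reverse=True), k)
    return dcg(labels, k) / ideal if ideal > 0 else 0.0

# Group documents by query, rank them by a model score, and average NDCG@10.
queries = defaultdict(list)
with open('train.txt') as f:            # hypothetical path to one LETOR fold
    for line in f:
        qid, label, feats = parse_letor_line(line)
        score = feats.get(25, 0.0)      # e.g. rank by a single feature such as a BM25 score
        queries[qid].append((score, label))

ndcgs = []
for qid, docs in queries.items():
    labels = [lab for _, lab in sorted(docs, key=lambda x: -x[0])]
    ndcgs.append(ndcg(labels, k=10))
print('mean NDCG@10:', sum(ndcgs) / len(ndcgs))
```

A learned ranker would replace the single-feature score with a model trained on the provided folds; averaging the measure over queries in this way mirrors how ranking performance is typically reported on the benchmark.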
