Two-Stage Learning to Rank for Information Retrieval

Current learning-to-rank approaches commonly focus on learning the best possible ranking function given a small, fixed set of documents. This document set is often retrieved from the collection using a simple unsupervised bag-of-words method such as BM25. This can lead to learning a suboptimal ranking, since many relevant documents may be excluded from the initially retrieved set. In this paper we propose a novel two-stage learning framework to address this problem. We first learn a ranking function over the entire retrieval collection using a limited set of textual features, including weighted phrases, proximities, and expansion terms. This function is then used to retrieve the best possible subset of documents, over which the final model is trained using a larger set of query- and document-dependent features. Empirical evaluation using two web collections unequivocally demonstrates that our proposed two-stage framework, being able to learn its model from more relevant documents, outperforms current learning-to-rank approaches.
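The control flow of the two-stage framework described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `Doc` class, the feature functions, the `quality` field, and all weights are hypothetical, and both stages use hand-set linear weights where the paper learns them from training data. The sketch only shows the shape of the pipeline, in which a cheap textual scorer is applied to the whole collection and a richer scorer reranks the retrieved subset.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Doc:
    doc_id: str
    tokens: List[str]
    quality: float  # hypothetical document-dependent feature used only in stage 2


def stage1_features(query: List[str], tokens: List[str]) -> List[float]:
    """Cheap textual features: query-term matches and exact ordered-bigram matches."""
    term_hits = sum(tokens.count(t) for t in query)
    query_bigrams = set(zip(query, query[1:]))
    bigram_hits = sum(1 for b in zip(tokens, tokens[1:]) if b in query_bigrams)
    return [float(term_hits), float(bigram_hits)]


def stage2_features(query: List[str], doc: Doc) -> List[float]:
    """Larger feature set: the stage-1 features plus document-dependent ones."""
    return stage1_features(query, doc.tokens) + [doc.quality, 1.0 / len(doc.tokens)]


def linear_score(weights: List[float], feats: List[float]) -> float:
    return sum(w * f for w, f in zip(weights, feats))


def two_stage_rank(query: List[str], collection: List[Doc],
                   w1: List[float], w2: List[float], k: int) -> List[str]:
    # Stage 1: score every document in the collection and keep the top k.
    candidates = sorted(
        collection,
        key=lambda d: linear_score(w1, stage1_features(query, d.tokens)),
        reverse=True,
    )[:k]
    # Stage 2: rerank only the retrieved subset with the richer feature set.
    reranked = sorted(
        candidates,
        key=lambda d: linear_score(w2, stage2_features(query, d)),
        reverse=True,
    )
    return [d.doc_id for d in reranked]


# Toy collection and assumed weights, for illustration only.
collection = [
    Doc("d1", "learning to rank for web search".split(), 0.2),
    Doc("d2", "rank learning rank learning rank".split(), 0.1),
    Doc("d3", "learning to rank improves retrieval quality".split(), 0.9),
    Doc("d4", "cooking pasta at home".split(), 0.8),
]
query = ["learning", "to", "rank"]
w1 = [1.0, 2.0]            # stage-1 weights: term matches, bigram matches
w2 = [0.5, 1.0, 4.0, 1.0]  # stage-2 weights: adds quality and inverse length

ranking = two_stage_rank(query, collection, w1, w2, k=3)
print(ranking)
```

In this toy run the off-topic document `d4` is dropped in stage 1 despite its high `quality` value, and stage 2 then promotes `d3` over `d1` using a feature that stage 1 never sees. In the framework described above, both weight vectors would be learned rather than fixed by hand.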
