Learning to Rank for Information Retrieval and Natural Language Processing

Learning to rank refers to machine learning techniques for training a model in a ranking task. It is useful for many applications in information retrieval, natural language processing, and data mining. The problem has been studied intensively in recent years, and significant progress has been made. This lecture gives an introduction to the area, including the fundamental problems, existing approaches, theories, applications, and future work. The author begins by showing that various ranking problems in information retrieval and natural language processing can be formalized as two basic ranking tasks, namely ranking creation (or simply ranking) and ranking aggregation. In ranking creation, given a request, one wants to generate a ranking list of offerings based on the features derived from the request and the offerings. In ranking aggregation, given a request, as well as a number of ranking lists of offerings, one wants to generate a new ranking list of the offerings. Ranking creation is the major problem in learning to rank and is usually formalized as a supervised learning task. The author gives detailed explanations of learning for ranking creation and ranking aggregation, including training and testing, evaluation, feature creation, and major approaches. Many methods have been proposed for ranking creation. They can be categorized as pointwise, pairwise, or listwise approaches according to the loss functions they employ. They can also be categorized according to the techniques they employ, such as the SVM based, Boosting based, and Neural Network based approaches. The author also introduces some popular learning to rank methods in detail. These include PRank, OC SVM, Ranking SVM, IR SVM, GBRank, RankNet, LambdaRank, ListNet & ListMLE, AdaRank, SVM MAP, SoftRank, Borda Count, Markov Chain, and CRanking.
The author explains several example applications of learning to rank, including web search, collaborative filtering, definition search, keyphrase extraction, query-dependent summarization, and re-ranking in machine translation. A formulation of learning for ranking creation is given in the statistical learning framework. Ongoing and future research directions for learning to rank are also discussed.

Table of Contents: Introduction / Learning for Ranking Creation / Learning for Ranking Aggregation / Methods of Learning to Rank / Applications of Learning to Rank / Theory of Learning to Rank / Ongoing and Future Work
