Book Reviews: Semantic Similarity from Natural Language and Ontology Analysis by Sébastien Harispe, Sylvie Ranwez, Stefan Janaqi, and Jacky Montmain

Learning to rank refers to machine learning techniques for training a model in a ranking task. Learning to rank is useful for many applications in information retrieval, natural language processing, and data mining. The area has been studied intensively in recent years, and significant progress has been made. This lecture gives an introduction to the area, including the fundamental problems, major approaches, theories, applications, and future work. The author begins by showing that various ranking problems in information retrieval and natural language processing can be formalized as two basic ranking tasks, namely ranking creation (or simply ranking) and ranking aggregation. In ranking creation, given a request, one wants to generate a ranking list of offerings based on the features derived from the request and the offerings. In ranking aggregation, given a request, as well as a number of ranking lists of offerings, one wants to generate a new ranking list of the offerings. Ranking creation (or ranking) is the major problem in learning to rank. It is usually formalized as a supervised learning task. The author explains learning for ranking creation and ranking aggregation in detail, including training and testing, evaluation, feature creation, and major approaches. Many methods have been proposed for ranking creation. The methods can be categorized as the pointwise, pairwise, and listwise approaches according to the loss functions they employ. They can also be categorized according to the techniques they employ, such as the SVM-based, boosting-based, and neural-network-based approaches. The author also introduces some popular learning to rank methods in detail. These include PRank, OC SVM, McRank, Ranking SVM, IR SVM, GBRank, RankNet, ListNet and ListMLE, AdaRank, SVM MAP, SoftRank, LambdaRank, LambdaMART, Borda Count, Markov Chain, and CRanking.
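As a rough illustration of ranking aggregation, the sketch below applies a simple Borda Count to several ranking lists of the same offerings. It is a minimal sketch, not the book's formulation: the function name `borda_count`, the scoring convention (an offering at 0-based position p in a list of n offerings earns n - 1 - p points), and the alphabetical tie-break are all assumptions made here for the example.

```python
from collections import defaultdict


def borda_count(rankings):
    """Aggregate several ranking lists (best offering first) into one list.

    Assumed convention: in a list of n offerings, the offering at 0-based
    position p earns n - 1 - p points. Points are summed over all lists.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, offering in enumerate(ranking):
            scores[offering] += n - 1 - position
    # Highest total first; ties are broken by the string form of the
    # offering so the output is deterministic (an assumption of this sketch).
    return sorted(scores, key=lambda o: (-scores[o], str(o)))


# Three hypothetical ranking lists over the same three offerings.
example_rankings = [
    ["a", "b", "c"],
    ["b", "a", "c"],
    ["a", "c", "b"],
]
aggregated = borda_count(example_rankings)
```

Here `a` collects 5 points, `b` collects 3, and `c` collects 1, so the aggregated list places `a` first. This variant is unsupervised: it uses no training data, only the positions in the input lists.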
The author explains several example applications of learning to rank including web search, collaborative filtering, definition search, keyphrase extraction, query dependent summarization, and re-ranking in machine translation. A formulation of learning for ranking creation is given in the statistical learning framework. Ongoing and future research directions for learning to rank are also discussed.
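One common way to write ranking creation as a supervised learning problem is sketched below. The notation is an assumption made for illustration and is not taken from the book: $x$ denotes the feature vectors derived from a request and its offerings, $y$ the corresponding relevance labels, $f$ the ranking model, $L$ a loss function, and $m$ the number of training requests.

```latex
% Expected risk of a ranking model f under the unknown distribution P(x, y)
R(f) = \int L\bigl(f(x), y\bigr) \, dP(x, y)

% Empirical risk over m training requests, minimized in practice
\hat{R}(f) = \frac{1}{m} \sum_{i=1}^{m} L\bigl(f(x_i), y_i\bigr)

% Example of a pairwise surrogate loss (hinge loss, in the style of Ranking SVM),
% summed over pairs of offerings (j, k) where j is labeled more relevant than k
L\bigl(f(x), y\bigr) = \sum_{(j,k) \,:\, y_j > y_k} \max\bigl(0,\; 1 - \bigl(f(x_j) - f(x_k)\bigr)\bigr)
```

Under this reading, the pointwise, pairwise, and listwise approaches differ mainly in the choice of $L$: a loss on individual offerings, on pairs of offerings, or on the whole ranking list.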
