ES-Rank: evolution strategy learning to rank approach

Learning to Rank (LTR) is a current problem in Information Retrieval (IR) that attracts considerable attention from researchers. The LTR problem concerns ranking the retrieved documents for users of search engines, question answering systems and product recommendation systems. A number of LTR approaches exist in the areas of machine learning and computational intelligence, but most are either too slow or not very effective. This paper investigates the application of evolutionary computation, specifically a (1+1) Evolution Strategy called ES-Rank, to tackle the LTR problem. Experimental results comparing the proposed method with fourteen other approaches from the literature show that ES-Rank achieves the best overall performance. Three datasets (MQ2007, MQ2008 and MSLR-WEB10K) from the LETOR benchmark collection and two performance metrics, Mean Average Precision (MAP) and Normalized Discounted Cumulative Gain at the top ten retrieved query-document pairs (NDCG@10), were used in the experiments. The contribution of this paper is an effective and efficient method for the LTR problem.
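To make the core idea concrete, the following is a minimal sketch of a (1+1) Evolution Strategy applied to LTR, under the assumption that the ranker is a linear scoring function over document features and that fitness is measured by NDCG@10, one of the metrics used in the paper. The function names (`es_rank_sketch`, `fitness`, `ndcg_at_k`), the fixed mutation step size, and the synthetic data format are illustrative assumptions, not details taken from the ES-Rank paper itself.

```python
# Hedged sketch: a (1+1) Evolution Strategy training a linear ranking model.
# Details (linear model, fixed sigma, NDCG@10 fitness) are assumptions for
# illustration and are not drawn from the ES-Rank paper.
import math
import random


def ndcg_at_k(relevances, k=10):
    """NDCG@k for one ranked list of graded relevance labels."""
    dcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))
    ideal = sorted(relevances, reverse=True)
    idcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(ideal[:k]))
    return dcg / idcg if idcg > 0 else 0.0


def fitness(weights, queries):
    """Mean NDCG@10 over queries; each query is (feature_vectors, labels)."""
    total = 0.0
    for features, labels in queries:
        scores = [sum(w * f for w, f in zip(weights, x)) for x in features]
        order = sorted(range(len(labels)), key=lambda i: scores[i], reverse=True)
        total += ndcg_at_k([labels[i] for i in order])
    return total / len(queries)


def es_rank_sketch(queries, n_features, generations=200, sigma=0.1, seed=0):
    """(1+1)-ES: perturb the single parent with Gaussian noise each generation;
    the child replaces the parent only if its fitness is at least as good."""
    rng = random.Random(seed)
    parent = [0.0] * n_features
    best = fitness(parent, queries)
    for _ in range(generations):
        child = [w + rng.gauss(0.0, sigma) for w in parent]
        score = fitness(child, queries)
        if score >= best:  # (1+1) selection: survivor is the better of the two
            parent, best = child, score
    return parent, best
```

Because the parent is replaced only when the offspring scores at least as well, the training fitness is non-decreasing across generations, which is the defining property of (1+1) selection.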
