Learning to Rank for Information Retrieval Using Genetic Programming

A central problem of information retrieval (IR) is to determine which documents are relevant to a user's information need and which are not. In practice, this problem is handled by a ranking function that orders documents by their degree of relevance to the user query. This paper discusses the use of machine learning to automatically generate an effective ranking function for IR, a task known in the field as "learning to rank for IR." It presents a learning method, RankGP, that addresses this task. RankGP employs genetic programming to learn a ranking function by combining various types of evidence in IR, including content features, structure features, and query-independent features. The proposed method is evaluated on the LETOR benchmark datasets and found to be competitive with Ranking SVM and RankBoost.
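To make the idea of evolving a ranking function concrete, the following is a minimal sketch of genetic programming applied to ranking, not RankGP itself. The abstract does not specify the function set, fitness measure, or evolutionary settings, so everything here is an assumption chosen for illustration: the operators `+`, `-`, `*`, mean average precision (MAP) as fitness, tournament selection of size 3, two elite survivors, a 20% mutation rate, and a depth cap of 6. Each individual is an expression tree whose leaves are feature indices; a document's score is the value of the tree on its feature vector.

```python
import operator
import random

# Hypothetical function set; the abstract does not list RankGP's operators.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def random_tree(rng, n_features, depth=3):
    """Grow a random tree: leaves are feature indices, inner nodes are (op, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.randrange(n_features)
    op = rng.choice(list(OPS))
    return (op, random_tree(rng, n_features, depth - 1),
            random_tree(rng, n_features, depth - 1))

def evaluate(tree, features):
    """Score one document by evaluating the tree on its feature vector."""
    if isinstance(tree, int):
        return features[tree]
    op, left, right = tree
    return OPS[op](evaluate(left, features), evaluate(right, features))

def tree_depth(tree):
    return 0 if isinstance(tree, int) else 1 + max(tree_depth(tree[1]), tree_depth(tree[2]))

def average_precision(ranked_labels):
    """Average precision of binary relevance labels listed in ranked order."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(ranked_labels, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

def fitness(tree, queries):
    """MAP over queries; each query is a list of (feature_vector, label) pairs."""
    aps = []
    for docs in queries:
        ranked = sorted(docs, key=lambda d: evaluate(tree, d[0]), reverse=True)
        aps.append(average_precision([rel for _, rel in ranked]))
    return sum(aps) / len(aps)

def paths(tree, path=()):
    """Yield the path (sequence of child indices) to every node in the tree."""
    yield path
    if not isinstance(tree, int):
        yield from paths(tree[1], path + (1,))
        yield from paths(tree[2], path + (2,))

def get(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return new
    node = list(tree)
    node[path[0]] = replace(tree[path[0]], path[1:], new)
    return tuple(node)

def evolve(queries, n_features, pop_size=30, generations=10, seed=0, max_depth=6):
    rng = random.Random(seed)
    pop = [random_tree(rng, n_features) for _ in range(pop_size)]
    for _ in range(generations):
        score = {t: fitness(t, queries) for t in set(pop)}
        next_pop = sorted(pop, key=score.get, reverse=True)[:2]  # elitism
        while len(next_pop) < pop_size:
            # Tournament selection of two parents.
            p1 = max(rng.sample(pop, 3), key=score.get)
            p2 = max(rng.sample(pop, 3), key=score.get)
            # Subtree crossover: graft a random subtree of p2 into p1.
            child = replace(p1, rng.choice(list(paths(p1))),
                            get(p2, rng.choice(list(paths(p2)))))
            if rng.random() < 0.2:  # subtree mutation
                child = replace(child, rng.choice(list(paths(child))),
                                random_tree(rng, n_features, depth=2))
            # Reject oversized children to limit bloat.
            next_pop.append(child if tree_depth(child) <= max_depth else p1)
        pop = next_pop
    return max(pop, key=lambda t: fitness(t, queries))
```

A call such as `evolve(queries, n_features=2)` returns the best tree found, which can then score unseen documents with `evaluate`. Optimizing a rank-based measure directly is possible here because genetic programming does not need the fitness function to be differentiable.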
