Active Learning for Ranking through Expected Loss Optimization

Learning to rank arises in many data mining applications, ranging from web search engines and online advertising to recommender systems. In learning to rank, the performance of a ranking model is strongly affected by the number of labeled examples in the training set; on the other hand, obtaining labeled training examples is expensive and time-consuming. This creates a great need for active learning approaches that select the most informative examples for ranking; however, the literature on active learning for ranking is still very limited. In this paper, we propose a general active learning framework for ranking, expected loss optimization (ELO), which is applicable to a wide range of ranking functions. Under this framework, we derive a novel algorithm, expected discounted cumulative gain (DCG) loss optimization (ELO-DCG), to select the most informative examples. We then investigate active learning for ranking at both the query and document levels and propose a two-stage ELO-DCG algorithm that incorporates both query and document selection. Furthermore, we show that the algorithm can flexibly handle the skewed grade distribution problem by modifying the loss function. Extensive experiments on real-world web search data sets demonstrate the great potential and effectiveness of the proposed framework and algorithms.
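The core of ELO-DCG is a decision-theoretic quantity: for each candidate query (or document), the DCG expected to be lost by ranking according to the posterior-mean scores instead of the unknown true scores; candidates with the largest expected loss are the most informative to label. The following minimal Python sketch illustrates that scoring for a single query, assuming the score posterior is approximated by samples (e.g., from a bootstrapped ensemble of rankers); the helper names and this particular Monte Carlo estimator are our illustrative assumptions, not the authors' implementation.

import numpy as np

def dcg(relevances, k=10):
    # DCG@k of a relevance list already arranged in rank order.
    rel = np.asarray(relevances, dtype=float)[:k]
    discounts = 1.0 / np.log2(np.arange(2, rel.size + 2))
    return float(np.sum((2.0 ** rel - 1.0) * discounts))

def expected_dcg_loss(score_samples, k=10):
    # score_samples: (n_samples, n_docs) array; each row is one posterior
    # draw of per-document relevance scores (here standing in for grades).
    # The loss is the average gap between the DCG of the ideal ordering
    # under each draw and the DCG achieved by the ordering that is
    # optimal under the posterior-mean scores.
    mean_scores = score_samples.mean(axis=0)
    bayes_order = np.argsort(-mean_scores)      # rank docs by posterior mean
    losses = []
    for s in score_samples:
        ideal = dcg(np.sort(s)[::-1], k)        # best DCG for this draw
        chosen = dcg(s[bayes_order], k)         # DCG of posterior-mean order
        losses.append(ideal - chosen)
    return float(np.mean(losses))

In an active learning loop, each unlabeled query would be scored this way and the highest-loss queries sent for labeling; in the two-stage variant, the same quantity would first select queries and then, within each selected query, score individual documents by their expected contribution to the DCG loss.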
