Active exploration for learning rankings from clickthrough data

We address the task of learning rankings of documents from search engine logs of user behavior. Previous work on this problem has relied on passively collected clickthrough data. In contrast, we show that an active exploration strategy can provide data that leads to much faster learning. Specifically, we develop a Bayesian approach for selecting rankings to present users so that interactions result in more informative training data. Our results using the TREC-10 Web corpus, as well as synthetic data, demonstrate that a directed exploration strategy quickly leads to users being presented improved rankings in an online learning setting. We find that active exploration substantially outperforms passive observation and random exploration.
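The abstract does not spell out the selection criterion, so the following is only an illustrative sketch of the general idea, not the paper's method: a toy online loop that keeps a Beta posterior over each pairwise document preference (a Bradley-Terry-style user model simulates clicks) and, when exploring actively, shows the pair whose posterior variance is highest, i.e. the comparison the model is least certain about. All names (`learn_ranking`, `beta_variance`, the skill values) are invented for this sketch.

```python
import math
import random
from itertools import combinations

def beta_variance(a, b):
    """Variance of a Beta(a, b) posterior over a pairwise preference."""
    return a * b / ((a + b) ** 2 * (a + b + 1))

def learn_ranking(n_docs=6, n_queries=200, active=True, seed=0):
    """Learn a ranking of n_docs documents from simulated pairwise clicks.

    Each unordered pair (i, j) holds a Beta posterior on P(i beats j).
    With active=True, the next query shows the pair whose posterior
    variance is largest (the least settled comparison); with
    active=False, a pair is chosen uniformly at random.
    """
    rng = random.Random(seed)
    skill = list(range(n_docs))  # hidden quality: document i has skill i

    def p_click(i, j):
        # Bradley-Terry model of the simulated user's preference for i over j.
        return math.exp(skill[i]) / (math.exp(skill[i]) + math.exp(skill[j]))

    pairs = list(combinations(range(n_docs), 2))
    counts = {p: [1, 1] for p in pairs}  # Beta(1, 1) priors per pair

    for _ in range(n_queries):
        if active:
            pair = max(pairs, key=lambda p: beta_variance(*counts[p]))
        else:
            pair = rng.choice(pairs)
        i, j = pair
        if rng.random() < p_click(i, j):  # simulated click on i
            counts[pair][0] += 1
        else:                             # simulated click on j
            counts[pair][1] += 1

    # Copeland-style score: count the pairs each document is believed to win;
    # sorting documents by this score yields the learned ranking.
    score = [0] * n_docs
    for (i, j), (a, b) in counts.items():
        winner = i if a / (a + b) > 0.5 else j
        score[winner] += 1
    return score
```

The variance criterion concentrates queries on close comparisons (preference probabilities near 0.5), which is where a random strategy wastes the fewest of its samples; this is the intuition behind directed exploration outperforming passive or random data collection.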
