SOLAR: Scalable Online Learning Algorithms for Ranking

Traditional learning-to-rank methods learn ranking models from training data in a batch, offline mode, which suffers from critical limitations: in particular, the model must be retrained from scratch whenever new training data arrives. This is clearly not scalable for many real applications, where training data arrives sequentially and frequently. To overcome these limitations, this paper presents SOLAR, a new framework of Scalable Online Learning Algorithms for Ranking, to tackle the challenge of scalable learning to rank. Specifically, we propose two novel SOLAR algorithms and derive theoretical bounds on their IR measures. We conduct extensive empirical studies comparing the SOLAR algorithms with conventional learning-to-rank algorithms on benchmark testbeds, in which promising results validate the efficacy and scalability of the proposed algorithms.
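The abstract does not spell out the SOLAR update rules, but the online setting it describes can be made concrete with a small sketch. Below is a minimal, hypothetical perceptron-style pairwise ranker in Python, not the authors' actual SOLAR algorithm: it keeps a linear scoring model and updates it one (relevant, irrelevant) document pair at a time, so no batch retraining is needed when new data arrives. The class name OnlinePairwiseRanker, the learning rate lr, and the margin test are all illustrative assumptions.

```python
import numpy as np

# Hypothetical sketch of online pairwise learning to rank (NOT the
# authors' SOLAR update rules): a linear model w scores documents, and
# each incoming (relevant, irrelevant) pair triggers at most one cheap
# perceptron-style correction instead of a full batch retrain.

class OnlinePairwiseRanker:
    def __init__(self, n_features: int, lr: float = 0.1):
        self.w = np.zeros(n_features)  # linear ranking model
        self.lr = lr                   # assumed step size

    def score(self, x: np.ndarray) -> float:
        """Relevance score of one document feature vector."""
        return float(self.w @ x)

    def update(self, x_rel: np.ndarray, x_irr: np.ndarray) -> None:
        """One online step on a (relevant, irrelevant) document pair."""
        # If the relevant document fails to outscore the irrelevant one,
        # nudge w along the pairwise feature difference; otherwise skip.
        if self.score(x_rel) - self.score(x_irr) <= 0:
            self.w += self.lr * (x_rel - x_irr)

# Usage: pairs arrive sequentially; each update costs O(d) per pair.
rng = np.random.default_rng(0)
ranker = OnlinePairwiseRanker(n_features=5)
for _ in range(1000):
    x_rel = rng.normal(size=5) + 0.5   # synthetic "relevant" document
    x_irr = rng.normal(size=5)         # synthetic "irrelevant" document
    ranker.update(x_rel, x_irr)
```

Each update touches only the current pair, which is exactly the scalability property the abstract contrasts with batch retraining; second-order or confidence-weighted variants would keep the per-example update but adapt the step size per feature.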
