EGRank: An exponentiated gradient algorithm for sparse learning-to-rank

Abstract This paper focuses on the problem of sparse learning-to-rank, where the learned ranking models have very few non-zero coefficients. We propose an exponentiated gradient algorithm that learns sparse ranking models by solving a convex optimization problem with an ℓ1 constraint. The algorithm has a competitive theoretical worst-case performance guarantee: it obtains an ϵ-accurate solution after O(1/ϵ) iterations. An early stopping criterion based on Fenchel duality is proposed to make the algorithm more efficient in practice. Extensive experiments on benchmark datasets demonstrate that a sparse ranking model can significantly improve ranking accuracy compared to dense models, and that the proposed algorithm is stable and competitive against several state-of-the-art baselines.
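To make the update concrete, below is a minimal sketch of an exponentiated gradient step under an ℓ1 constraint, in the spirit of the abstract. The pairwise hinge loss, the doubling-trick parameterization (w = z(u − v) with (u, v) on the probability simplex), and the function names are illustrative assumptions, not the paper's actual objective; the paper's Fenchel-duality-based early stopping is likewise replaced here by a crude iterate-change test.

```python
import numpy as np

def pairwise_hinge_grad(w, pair_diffs):
    """Subgradient of the pairwise hinge loss sum_k max(0, 1 - w . d_k),
    where each row d_k = x_preferred - x_other for a preference pair."""
    margins = pair_diffs @ w
    active = margins < 1.0           # pairs violating the margin
    return -pair_diffs[active].sum(axis=0)

def eg_rank(pair_diffs, z=1.0, eta=0.1, n_iters=200, tol=1e-4):
    """Illustrative exponentiated gradient loop under the constraint ||w||_1 <= z."""
    d = pair_diffs.shape[1]
    # Doubling trick: represent w = z * (u - v) with (u, v) jointly on the
    # simplex, so the l1 constraint ||w||_1 <= z holds automatically.
    theta = np.full(2 * d, 1.0 / (2 * d))
    for _ in range(n_iters):
        w = z * (theta[:d] - theta[d:])
        g = pairwise_hinge_grad(w, pair_diffs)
        g_full = z * np.concatenate([g, -g])   # chain rule through w(theta)
        theta_new = theta * np.exp(-eta * g_full)
        theta_new /= theta_new.sum()           # multiplicative update + renormalize
        # Stand-in for the paper's Fenchel duality-gap stopping criterion:
        if np.abs(theta_new - theta).sum() < tol:
            theta = theta_new
            break
        theta = theta_new
    return z * (theta[:d] - theta[d:])
```

Because the update is multiplicative, coordinates whose gradients are consistently unfavorable shrink geometrically toward zero, which is the intuition behind the sparsity of the learned ranking model.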
