Sparse Learning-to-Rank via an Efficient Primal-Dual Algorithm

Learning-to-rank for information retrieval has gained increasing interest in recent years. Inspired by the success of sparse models, we consider the problem of sparse learning-to-rank, where the learned ranking models are constrained to have only a few nonzero coefficients. We begin by formulating the sparse learning-to-rank problem as a convex optimization problem with a sparsity-inducing $\ell_1$ constraint. Since the $\ell_1$ constraint is nondifferentiable, the critical issue is how to solve the optimization problem efficiently. To address this issue, we propose a learning algorithm from the primal-dual perspective. Furthermore, we prove that, after at most $O(\frac{1}{\epsilon})$ iterations, the proposed algorithm is guaranteed to obtain an $\epsilon$-accurate solution. This convergence rate is better than that of the popular subgradient descent algorithm, which is $O(\frac{1}{\epsilon^2})$. Empirical evaluation on several public benchmark data sets demonstrates the effectiveness of the proposed algorithm: 1) Compared to methods that learn dense models, learning a ranking model under a sparsity constraint significantly improves ranking accuracy. 2) Compared to other methods for sparse learning-to-rank, the proposed algorithm tends to obtain sparser models and yields gains in both ranking accuracy and training time. 3) Compared to several state-of-the-art algorithms, the ranking accuracy of the proposed algorithm is very competitive and stable.
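
The abstract does not spell out the formulation, but as a rough illustration of the kind of problem being solved, the sketch below trains a pairwise-hinge ranking model under an $\ell_1$-ball constraint using projected subgradient descent, i.e., the $O(\frac{1}{\epsilon^2})$ baseline the abstract compares against, not the proposed primal-dual algorithm. The data layout (`X_pos`, `X_neg` as matched rows of preference pairs), the ball radius, and the step-size schedule are illustrative assumptions.

```python
import numpy as np

def project_l1_ball(v, z=1.0):
    """Euclidean projection of v onto the l1 ball of radius z
    (standard sort-based routine, cf. Duchi et al., 2008)."""
    if np.sum(np.abs(v)) <= z:
        return v.copy()
    u = np.sort(np.abs(v))[::-1]                 # magnitudes, descending
    cssv = np.cumsum(u)
    rho = np.nonzero(u - (cssv - z) / (np.arange(len(u)) + 1) > 0)[0][-1]
    theta = (cssv[rho] - z) / (rho + 1.0)
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def pairwise_hinge_subgradient(w, X_pos, X_neg):
    """Subgradient of (1/m) * sum_i max(0, 1 - w^T (x_i^+ - x_i^-)),
    where row i of X_pos should be ranked above row i of X_neg."""
    diffs = X_pos - X_neg                        # one row per preference pair
    active = (diffs @ w) < 1.0                   # pairs violating the margin
    return -diffs[active].sum(axis=0) / len(diffs)

def train_sparse_ranker(X_pos, X_neg, radius=10.0, step=0.1, iters=1000):
    """Projected subgradient descent under ||w||_1 <= radius (illustrative
    baseline only; not the paper's primal-dual method)."""
    w = np.zeros(X_pos.shape[1])
    for t in range(1, iters + 1):
        g = pairwise_hinge_subgradient(w, X_pos, X_neg)
        w = project_l1_ball(w - (step / np.sqrt(t)) * g, radius)
    return w
```

The $\ell_1$-ball constraint is what produces sparsity: the projection step zeroes out coefficients whose magnitude falls below the threshold $\theta$, so the returned `w` typically has only a few nonzero entries when `radius` is small relative to the data scale.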
