An efficient algorithm for learning to rank from preference graphs

In this paper, we introduce a framework for regularized least-squares (RLS) type ranking cost functions and propose three such cost functions. Further, we propose a kernel-based preference learning algorithm, which we call RankRLS, for minimizing these functions. We show that RankRLS has many computational advantages over ranking algorithms based on minimizing other types of costs, such as the hinge cost. In particular, we present efficient algorithms for training, parameter selection, multiple output learning, cross-validation, and large-scale learning, and we consider the circumstances under which these computational benefits make RankRLS preferable to RankSVM. We evaluate RankRLS on four different types of ranking tasks, using RankSVM and standard RLS regression as baselines. RankRLS outperforms standard RLS regression, and its ranking performance is very similar to that of RankSVM, while RankRLS retains the computational benefits described above.
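To make the idea concrete, the following is a minimal sketch of an RLS-type pairwise ranking objective of the kind described above, not the authors' implementation. It assumes a linear kernel and a complete preference graph over the training examples (every pair is compared), in which case the pairwise squared cost can be written with the graph Laplacian L = nI - 11^T and minimized in closed form; the function names `rank_rls_fit` and `rank_rls_predict` are hypothetical.

```python
import numpy as np

def rank_rls_fit(X, y, lam=1.0):
    """Sketch of a pairwise least-squares ranking fit (complete preference graph).

    Minimizes  sum_{i,j} ((y_i - y_j) - (f(x_i) - f(x_j)))^2  +  lam * ||f||^2
    over functions f(x) = sum_i a_i k(x_i, x) with a linear kernel. Writing the
    pairwise cost via the graph Laplacian L = n*I - 1*1^T gives the closed-form
    coefficients  a = (L K + lam * I)^{-1} L y.
    """
    n = X.shape[0]
    K = X @ X.T                          # linear kernel matrix
    L = n * np.eye(n) - np.ones((n, n))  # Laplacian of the all-pairs graph
    a = np.linalg.solve(L @ K + lam * np.eye(n), L @ y)
    return a

def rank_rls_predict(a, X_train, X_test):
    """Score test points: f(x) = sum_i a_i <x_i, x>."""
    return X_test @ X_train.T @ a
```

Because the cost is a quadratic function of the coefficients, training reduces to one linear system; this is the source of the computational advantages (e.g. efficient cross-validation and regularization-path computation) that the paper develops, in contrast to the quadratic programming required by hinge-type costs.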
