Bipartite Ranking through Minimization of Univariate Loss

Minimization of the rank loss or, equivalently, maximization of the AUC in bipartite ranking calls for minimizing the number of disagreements between pairs of instances. Since the complexity of this problem is inherently quadratic in the number of training examples, it is tempting to ask how much is actually lost by minimizing a simple univariate loss function, as done by standard classification methods, as a surrogate. In this paper, we first note that minimization of 0/1 loss is not an option, as it may yield an arbitrarily high rank loss. We show, however, that better results can be achieved by means of a weighted (cost-sensitive) version of 0/1 loss. Yet, the real gain is obtained through margin-based loss functions, for which we are able to derive proper bounds, not only for rank risk but, more importantly, also for rank regret. The paper is completed with an experimental study in which we address specific questions raised by our theoretical analysis.
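The gap between 0/1 loss and rank loss can be made concrete with a small sketch. The code below is an illustration rather than the paper's own method. It assumes labels in {-1, +1}, a decision threshold at zero for the 0/1 loss, and the common convention that a tied positive-negative pair counts as half a disagreement. The function names `rank_loss`, `zero_one_loss`, and `exp_loss` are hypothetical. The pairwise rank loss compares every positive with every negative, which is the quadratic cost mentioned above, while the univariate losses only need a single pass over the instances.

```python
import numpy as np


def rank_loss(scores, labels):
    """Fraction of (positive, negative) pairs that are misordered.

    Assumption: labels are in {-1, +1}; ties count as half an error.
    The pairwise comparison is O(n_pos * n_neg), i.e. quadratic.
    This value equals 1 - AUC under the same tie convention.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    pos = scores[labels == 1]
    neg = scores[labels == -1]
    # diff[i, j] = score of positive i minus score of negative j
    diff = pos[:, None] - neg[None, :]
    return float(np.mean((diff < 0) + 0.5 * (diff == 0)))


def zero_one_loss(scores, labels):
    """Univariate 0/1 loss with an assumed threshold at zero."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    pred = np.where(scores > 0, 1, -1)
    return float(np.mean(pred != labels))


def exp_loss(scores, labels):
    """Univariate exponential loss, one example of a margin-based loss."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    return float(np.mean(np.exp(-labels * scores)))


# Imbalanced toy data: 1 positive, 99 negatives.
labels = np.array([1] + [-1] * 99)
# Every instance is predicted negative, and the single positive
# receives the lowest score of all.
scores = np.array([-2.0] + [-1.0] * 99)

print("0/1 loss: ", zero_one_loss(scores, labels))
print("rank loss:", rank_loss(scores, labels))
print("exp loss: ", exp_loss(scores, labels))
```

In this toy case only one instance in a hundred is misclassified, yet every positive-negative pair is ordered incorrectly, so a scorer that looks nearly perfect under 0/1 loss is the worst possible ranker. The margin-based exponential loss, by contrast, penalizes the badly scored positive according to the size of its margin.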
