Paper information - A Large Deviation Bound for the Area Under the ROC Curve


The area under the ROC curve (AUC) has been advocated as an evaluation criterion for the bipartite ranking problem. We study large deviation properties of the AUC; in particular, we derive a distribution-free large deviation bound for the AUC which serves to bound the expected accuracy of a ranking function in terms of its empirical AUC on an independent test sequence. A comparison of our result with a corresponding large deviation result for the classification error rate suggests that the test sample size required to obtain an ε-accurate estimate of the expected accuracy of a ranking function with δ-confidence is larger than that required to obtain an ε-accurate estimate of the expected error rate of a classification function with the same confidence. A simple application of the union bound allows the large deviation bound to be extended to learned ranking functions chosen from finite function classes.
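The abstract does not state the bound itself, so the following Python sketch is illustrative only and rests on stated assumptions. It computes the empirical AUC as the Wilcoxon-Mann-Whitney statistic, then evaluates a deviation bound of the form 2·exp(−2mnε²/(m+n)), which is what the method of bounded differences gives for m positive and n negative test examples when the labels are held fixed. That form is an assumption here and may differ from the paper's exact statement. The function names and the parameter `rho` (the fraction of positive examples) are hypothetical. The Hoeffding-style sample size for the classification error rate is included for comparison.

```python
import math


def empirical_auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 1/2."""
    total = 0.0
    for p in pos_scores:
        for q in neg_scores:
            if p > q:
                total += 1.0
            elif p == q:
                total += 0.5
    return total / (len(pos_scores) * len(neg_scores))


def auc_deviation_bound(m, n, eps):
    """Assumed bound on P(|empirical AUC - expected AUC| >= eps).

    Assumption: bounded differences with m positives and n negatives
    gives 2 * exp(-2 * m * n * eps^2 / (m + n)).
    """
    return min(1.0, 2.0 * math.exp(-2.0 * m * n * eps ** 2 / (m + n)))


def auc_sample_size(eps, delta, rho):
    """Total test size N making the assumed AUC bound at most delta.

    With m = rho * N and n = (1 - rho) * N, m * n / (m + n) = rho * (1 - rho) * N.
    """
    return math.ceil(math.log(2.0 / delta) / (2.0 * rho * (1.0 - rho) * eps ** 2))


def error_rate_sample_size(eps, delta):
    """Hoeffding-style test size for an eps-accurate error-rate estimate."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * eps ** 2))


if __name__ == "__main__":
    print(empirical_auc([0.9, 0.8], [0.1, 0.85]))
    print(auc_sample_size(0.1, 0.05, rho=0.5))
    print(error_rate_sample_size(0.1, 0.05))
```

Under this assumed form, the AUC sample size exceeds the error-rate sample size by a factor of 1/(ρ(1−ρ)), which is at least 4 and grows as the classes become more imbalanced. This is consistent with the comparison described in the abstract.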
