A Distributionally Robust Area Under Curve Maximization Model

Abstract: The area under the ROC curve (AUC) is a performance measure for classification models. We propose new distributionally robust AUC maximization models (DR-AUC) that rely on the Kantorovich metric and approximate the AUC with the hinge loss function, and we derive convex reformulations using duality. In the numerical experiments, the DR-AUC models outperform deterministic AUC and support vector machine models and achieve superior worst-case out-of-sample performance, which demonstrates their robustness. These results are encouraging because the experiments use small training sets, a setting that tends to produce low out-of-sample performance.
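
To make the hinge-loss approximation of the AUC concrete, the following minimal sketch computes the empirical AUC of a linear scoring rule as the fraction of correctly ranked (positive, negative) pairs, together with a pairwise hinge surrogate. It illustrates only the deterministic surrogate, not the distributionally robust reformulation, which the abstract does not spell out. The function names, the unit margin, the linear scorer, and the toy data are all assumptions made for illustration.

```python
from itertools import product


def score(w, x):
    """Linear score w.x (the linear scorer is an assumption of this sketch)."""
    return sum(wi * xi for wi, xi in zip(w, x))


def empirical_auc(w, positives, negatives):
    """Fraction of (positive, negative) pairs in which the positive scores strictly higher."""
    pairs = list(product(positives, negatives))
    wins = sum(1 for p, n in pairs if score(w, p) > score(w, n))
    return wins / len(pairs)


def pairwise_hinge_loss(w, positives, negatives, margin=1.0):
    """Average hinge penalty over all pairs; with margin 1 it upper-bounds 1 - AUC."""
    pairs = list(product(positives, negatives))
    total = sum(max(0.0, margin - (score(w, p) - score(w, n))) for p, n in pairs)
    return total / len(pairs)


# Hypothetical toy data: two positive and two negative feature vectors.
positives = [(2.0, 1.0), (1.5, 0.5)]
negatives = [(0.0, 0.0), (1.0, 1.0)]
w = (1.0, 0.0)

auc = empirical_auc(w, positives, negatives)
loss = pairwise_hinge_loss(w, positives, negatives)
print(auc, loss)
```

Because the hinge term is convex in w and, with a unit margin, is at least 1 for every misranked pair, minimizing the surrogate is a convex proxy for maximizing the AUC. Here every pair is ranked correctly, yet one pair falls inside the margin and still incurs a penalty.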
