When All We Need is a Piece of the Pie: A Generic Framework for Optimizing Two-way Partial AUC

The Area Under the ROC Curve (AUC) is a crucial metric for machine learning, which evaluates the average performance over all possible True Positive Rates (TPRs) and False Positive Rates (FPRs). Since a skillful classifier should simultaneously achieve a high TPR and a low FPR, we study a more general variant called Two-way Partial AUC (TPAUC), where only the region with TPR ≥ α and FPR ≤ β is included in the area. Moreover, a recent work shows that the TPAUC is essentially inconsistent with the existing Partial AUC metrics, where only the FPR range is restricted, which opens a new problem: how to find solutions that achieve a high TPAUC. Motivated by this, we present the first attempt to optimize this new metric. The critical challenge lies in the difficulty of performing gradient-based optimization with end-to-end stochastic training, even with a proper choice of surrogate loss. To address this issue, we propose a generic framework to construct surrogate optimization problems, which supports efficient end-to-end training with deep learning. Moreover, our theoretical analyses show that: 1) the objective function of the surrogate problems is an upper bound of that of the original problem under mild conditions, and 2) optimizing the surrogate problems leads to good generalization performance in terms of TPAUC with high probability. Finally, empirical studies over several benchmark datasets speak to the efficacy of our framework.

Affiliations: State Key Laboratory of Info. Security (SKLOIS), Inst. of Info. Engin., CAS, Beijing, China. School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China. Key Lab of Intell. Info. Process., Inst. of Comput. Tech., CAS, Beijing, China. Alibaba Group, Beijing, China. School of Computer Science and Tech., University of Chinese Academy of Sciences, Beijing, China. Peng Cheng Laboratory, Shenzhen, China. BDKM, University of Chinese Academy of Sciences, Beijing, China.

Correspondence to: Qianqian Xu <xuqianqian@ict.ac.cn>, Qingming Huang <qmhuang@ucas.ac.cn>. Proceedings of the 38th International Conference on Machine Learning, PMLR 139, 2021. Copyright 2021 by the author(s).
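To make the restricted region concrete, the following minimal sketch measures the area of the empirical ROC curve that lies inside TPR ≥ α and FPR ≤ β. It is an illustration of the metric only, not the surrogate framework proposed in the paper. The function name `two_way_partial_auc` and its arguments are hypothetical, and the sketch assumes that ties between a positive and a negative score count as ranking errors.

```python
from bisect import bisect_right


def two_way_partial_auc(pos_scores, neg_scores, alpha, beta):
    """Area of the empirical ROC region with TPR >= alpha and FPR <= beta.

    Illustrative sketch (hypothetical helper, not the paper's estimator).
    The unnormalized area lies in [0, (1 - alpha) * beta].
    """
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative score")

    pos_sorted = sorted(pos_scores)              # ascending, for bisect
    neg_desc = sorted(neg_scores, reverse=True)  # hardest negatives first
    n_pos, n_neg = len(pos_sorted), len(neg_desc)

    area = 0.0
    for j, s_neg in enumerate(neg_desc):
        # FPR interval swept when the threshold drops past this negative,
        # clipped on the right at beta.
        left = j / n_neg
        right = min((j + 1) / n_neg, beta)
        if right <= left:
            break  # the rest of the curve lies beyond FPR = beta

        # TPR on this interval: fraction of positives scored strictly
        # above the current negative (assumption: ties count as errors).
        tpr = (n_pos - bisect_right(pos_sorted, s_neg)) / n_pos

        # Only the part of the step above TPR = alpha contributes.
        area += (right - left) * max(tpr - alpha, 0.0)
    return area


pos = [0.9, 0.8, 0.7, 0.3]
neg = [0.6, 0.4, 0.2, 0.1]
full_auc = two_way_partial_auc(pos, neg, alpha=0.0, beta=1.0)  # 0.875
tpauc = two_way_partial_auc(pos, neg, alpha=0.5, beta=0.5)     # 0.125
```

With α = 0 and β = 1 the function reduces to the ordinary AUC (14 of 16 pairs ranked correctly, i.e. 0.875). With α = β = 0.5 the largest attainable area is (1 − α)·β = 0.25, so the value 0.125 corresponds to half of the restricted region.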
