The Adversarial Consistency of Surrogate Risks for Binary Classification

We study the consistency of surrogate risks for robust binary classification. It is common to learn robust classifiers by adversarial training, which seeks to minimize the expected $0$-$1$ loss when each example can be maliciously corrupted within a small ball. We give a simple and complete characterization of the set of surrogate loss functions that are \emph{consistent}, i.e., that can replace the $0$-$1$ loss without affecting the minimizing sequences of the original adversarial risk, for any data distribution. We also prove a quantitative version of adversarial consistency for the $\rho$-margin loss. Our results reveal that the class of adversarially consistent surrogates is substantially smaller than in the standard setting, where many common surrogates are known to be consistent.
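
To fix ideas, the objects in the abstract can be written out as follows; this is the standard formulation from the adversarial-learning literature, and the notation ($\epsilon$ for the perturbation radius, $\phi$ for the surrogate) is supplied here for illustration rather than taken from the abstract. For data $(x, y) \sim \mathcal{D}$ with labels $y \in \{-1, +1\}$, the adversarial classification risk and the adversarial surrogate risk of a real-valued score function $f$ are

$$R^\epsilon(f) = \mathbb{E}_{(x,y) \sim \mathcal{D}} \Big[ \sup_{\|x' - x\| \le \epsilon} \mathbf{1}\{\operatorname{sign} f(x') \neq y\} \Big], \qquad R^\epsilon_\phi(f) = \mathbb{E}_{(x,y) \sim \mathcal{D}} \Big[ \sup_{\|x' - x\| \le \epsilon} \phi\big(y f(x')\big) \Big].$$

In these terms, $\phi$ is adversarially consistent when, for every distribution $\mathcal{D}$, any sequence $(f_n)$ with $R^\epsilon_\phi(f_n) \to \inf_f R^\epsilon_\phi(f)$ also satisfies $R^\epsilon(f_n) \to \inf_f R^\epsilon(f)$. The $\rho$-margin loss for which the quantitative result is stated is commonly written as the ramp $\phi_\rho(\alpha) = \min\big(1, \max(0, 1 - \alpha/\rho)\big)$, which, unlike convex surrogates such as the hinge loss, is bounded and flat outside the margin region.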
