Existence and Minimax Theorems for Adversarial Surrogate Risks in Binary Classification

Adversarial training is one of the most popular methods for training models robust to adversarial attacks; however, it is not well understood from a theoretical perspective. We prove existence, regularity, and minimax theorems for adversarial surrogate risks. Our results explain some empirical observations on adversarial robustness from prior work and suggest new directions in algorithm development. Furthermore, our results extend previously known existence and minimax theorems for the adversarial classification risk to surrogate risks.
