Distributional Robustness with IPMs and links to Regularization and GANs

Robustness to adversarial attacks has received an abundance of attention in recent years, owing to the fragility of deep neural networks to small perturbations. Distributionally Robust Optimization (DRO), a particularly promising way of addressing this challenge, studies robustness via divergence-based uncertainty sets and has provided valuable insights into robustification strategies such as regularization. In the context of machine learning, the majority of existing results have chosen $f$-divergences, Wasserstein distances and, more recently, the Maximum Mean Discrepancy (MMD) to construct uncertainty sets. We extend this line of work on understanding robustness via regularization by studying uncertainty sets constructed with Integral Probability Metrics (IPMs), a large family of divergences that includes the MMD, Total Variation and Wasserstein distances. Our main result shows that DRO under \textit{any} choice of IPM corresponds to a family of regularization penalties, which recover and improve upon existing results in the MMD and Wasserstein settings. Due to the generality of our result, we further show that other choices of IPM correspond to other commonly used penalties in machine learning. Furthermore, we extend our results to shed light on adversarial generative modelling via $f$-GANs, constituting the first study of distributional robustness for the $f$-GAN objective. Our results unveil the inductive properties of the discriminator set with regard to robustness, allowing us to comment positively on several penalty-based GAN methods such as Wasserstein-, MMD- and Sobolev-GANs. In summary, our results intimately link GANs to distributional robustness, extend previous results on DRO, and contribute to our understanding of the link between regularization and robustness at large.
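For orientation, recall the standard definition underlying this setting (Müller, 1997): the IPM generated by a function class $\mathcal{F}$ is
\[
d_{\mathcal{F}}(P, Q) \;=\; \sup_{f \in \mathcal{F}} \left| \mathbb{E}_{P}\!\left[f\right] - \mathbb{E}_{Q}\!\left[f\right] \right|,
\]
so that the unit Lipschitz ball recovers the Wasserstein-1 distance, the unit RKHS ball recovers the MMD, and the unit sup-norm ball recovers (twice) the Total Variation distance. In its known Wasserstein and MMD special cases, the DRO-regularization correspondence alluded to above takes the schematic form
\[
\sup_{Q \,:\, d_{\mathcal{F}}(Q, P) \le \varepsilon} \mathbb{E}_{Q}\!\left[\ell\right] \;\le\; \mathbb{E}_{P}\!\left[\ell\right] + \varepsilon \, \Lambda_{\mathcal{F}}(\ell),
\]
where the penalty $\Lambda_{\mathcal{F}}(\ell)$ is the Lipschitz constant of the loss $\ell$ in the Wasserstein case and the RKHS norm $\|\ell\|_{\mathcal{H}}$ in the MMD case; the form of the penalty for a general IPM is the subject of our main result and is not reproduced here.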

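Similarly, for the GAN connection, the $f$-GAN objective of Nowozin et al. (2016) is the saddle-point problem
\[
\min_{Q} \; \sup_{T \in \mathcal{T}} \; \mathbb{E}_{X \sim P}\!\left[T(X)\right] - \mathbb{E}_{Y \sim Q}\!\left[f^{*}\!\big(T(Y)\big)\right],
\]
where $f^{*}$ is the convex conjugate of $f$ and $\mathcal{T}$ is the discriminator class. With an unrestricted $\mathcal{T}$, the inner supremum equals the $f$-divergence $D_f(P \,\|\, Q)$; restricting $\mathcal{T}$ by a Lipschitz, RKHS-norm or Sobolev-norm constraint, in the spirit of Wasserstein-, MMD- and Sobolev-GANs, replaces it with a restricted discrepancy whose properties depend on $\mathcal{T}$. These are standard identities, recalled here only to fix notation; our contribution is the distributional robustness analysis of this restricted objective.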