Enhancing the Robustness of Deep Neural Networks by Boundary Conditional GAN

Deep neural networks have been widely deployed in various machine learning tasks. However, recent works have demonstrated that they are vulnerable to adversarial examples: inputs with carefully crafted small perturbations that cause the network to misclassify. In this work, we propose a novel defense mechanism called Boundary Conditional GAN to enhance the robustness of deep neural networks against adversarial examples. Boundary Conditional GAN, a modified version of Conditional GAN, generates boundary samples with true labels near the decision boundary of a pre-trained classifier. These boundary samples are fed back to the pre-trained classifier as data augmentation to make the decision boundary more robust. We empirically show that the model improved by our approach consistently and successfully defends against various types of adversarial attacks. Further quantitative analyses of the robustness improvement and visualizations of the decision boundaries are also provided to justify the effectiveness of our strategy. This new defense mechanism, which uses boundary samples to enhance the robustness of networks, opens up a new way to defend against adversarial attacks consistently.
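The abstract only sketches the mechanism at a high level. The following PyTorch snippet is a minimal illustrative sketch of the general idea: augmenting a pre-trained classifier with label-conditioned generated samples that lie near its decision boundary. The module names (`classifier`, `generator`), the entropy-based boundary criterion, and all hyperparameters are assumptions made for illustration, not details taken from the paper.

```python
# Illustrative sketch (not the authors' implementation): fine-tune a
# pre-trained classifier on a mix of real data and label-conditioned
# samples generated near its decision boundary.
import torch
import torch.nn.functional as F

def boundary_entropy(logits):
    """Prediction entropy; high when the classifier is uncertain,
    i.e. when a sample sits close to the decision boundary.
    (Assumed boundary criterion, used here only for illustration.)"""
    probs = F.softmax(logits, dim=1)
    return -(probs * torch.log(probs + 1e-8)).sum(dim=1).mean()

def finetune_step(classifier, generator, optimizer,
                  real_x, real_y, num_classes, latent_dim):
    """One fine-tuning step: generated boundary samples keep their
    conditioning labels and serve as data augmentation."""
    batch = real_x.size(0)
    z = torch.randn(batch, latent_dim, device=real_x.device)
    fake_y = torch.randint(0, num_classes, (batch,), device=real_x.device)
    with torch.no_grad():
        fake_x = generator(z, fake_y)  # conditional samples near the boundary

    x = torch.cat([real_x, fake_x], dim=0)
    y = torch.cat([real_y, fake_y], dim=0)

    optimizer.zero_grad()
    loss = F.cross_entropy(classifier(x), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this reading, the same entropy term could be added to the conditional generator's objective so that it is pushed toward producing uncertain (boundary) samples, while the classifier is then fine-tuned on those samples with their conditioning labels treated as ground truth; both choices are assumptions for the sketch rather than the paper's stated procedure.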
