A novel and universal GAN-based countermeasure to recover adversarial examples to benign examples

Abstract Recent studies have demonstrated that deep neural networks (DNNs) are vulnerable to adversarial examples, which contain subtle, human-imperceptible perturbations. Although numerous countermeasures have been proposed and play a significant role, most of them have flaws and are effective only against certain types of adversarial examples. In this paper, we present a novel and universal countermeasure that recovers multiple types of adversarial examples to benign examples before they are fed into the deep neural network. The idea is to model the mapping between adversarial and benign examples with a generative adversarial network (GAN). The architecture consists of a generator based on U-Net, a discriminator based on AC-GAN, and a newly added third-party classifier. The U-Net structure enhances the generator's capacity to recover adversarial examples to benign examples. The loss function combines the advantages of AC-GAN and WGAN-GP to stabilize the training process and accelerate its convergence. In addition, a classification loss and a perceptual loss, both derived from the third-party classifier, further improve the generator's ability to eliminate adversarial perturbations. Experiments are conducted on the MNIST, CIFAR-10, and ImageNet datasets. First, ablation experiments verify the validity of the proposed countermeasure. Then, we defend against seven types of state-of-the-art adversarial examples on four deep neural networks and compare our approach with six existing countermeasures. The experimental results demonstrate that the proposed countermeasure is universal and outperforms the other countermeasures. The experimental code is available at https://github.com/Afreadyang/IAED-GAN .
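To make the composite objective concrete, the sketch below shows one plausible way to assemble the generator loss the abstract describes: a WGAN-style adversarial term (the gradient penalty lives in the critic's loss), an AC-GAN auxiliary classification term, and a classification plus perceptual term taken from a frozen third-party classifier. This is a minimal illustration under assumed interfaces, not the authors' implementation: the module signatures (G, D, clf), the feature-matching form of the perceptual loss, and the weights w_cls and w_perc are all hypothetical.

```python
import torch
import torch.nn.functional as F

def generator_loss(G, D, clf, x_adv, x_benign, y_true,
                   w_cls=1.0, w_perc=1.0):
    """Assumed batch-level generator loss for the described architecture.

    G        -- U-Net-style generator mapping adversarial -> recovered images
    D        -- AC-GAN-style critic returning (realness score, class logits)
    clf      -- frozen third-party classifier returning (logits, features)
    x_adv    -- batch of adversarial examples
    x_benign -- the paired benign originals (available at training time)
    y_true   -- ground-truth labels of the benign images
    """
    x_rec = G(x_adv)  # recovered (ideally benign-looking) images

    # WGAN generator term: maximize the critic's score on recovered images,
    # i.e. minimize its negation.
    score, aux_logits = D(x_rec)
    adv_term = -score.mean()

    # AC-GAN auxiliary term: the critic's class head should assign the
    # recovered image its original benign label.
    acgan_term = F.cross_entropy(aux_logits, y_true)

    # Third-party classifier terms: a classification loss on the recovered
    # image, plus a perceptual loss matching its internal features to those
    # of the paired benign image (the classifier itself is not updated).
    logits_rec, feats_rec = clf(x_rec)
    with torch.no_grad():
        _, feats_ben = clf(x_benign)
    clf_term = F.cross_entropy(logits_rec, y_true)
    perc_term = F.mse_loss(feats_rec, feats_ben)

    return adv_term + w_cls * (acgan_term + clf_term) + w_perc * perc_term
```

Under this reading, the adversarial term pushes recovered images toward the benign data distribution, while the classifier-derived terms anchor them to the correct semantic class, which is what lets the same generator undo perturbations produced by different attacks.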
