GAP++: Learning to generate target-conditioned adversarial examples

Adversarial examples are perturbed inputs that pose a serious threat to machine learning models. Finding these perturbations is typically cast as an optimization problem that must be solved with iterative methods, which is computationally expensive. For efficiency, recent works use adversarial generative networks to directly model the distribution of universal or image-dependent perturbations. However, these methods generate perturbations conditioned only on the input image. In this work, we propose a more general-purpose framework that infers target-conditioned perturbations from both the input image and the target label. Unlike previous single-target attack models, our model can mount attacks toward any chosen target by learning the relation between the attack target and the semantics of the image. In extensive experiments on MNIST and CIFAR-10, we show that our method achieves superior performance compared with single-target attack models and obtains high fooling rates with small perturbation norms.
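As a concrete illustration of the core idea (a generator that consumes both the image and a target label and emits a bounded perturbation), the sketch below shows one plausible realization in PyTorch. This is a minimal sketch, not the paper's actual architecture: the class name `TargetConditionedGenerator`, the layer widths, the choice to fuse the label by embedding it into a spatial map and concatenating it with the image, and the `eps` budget are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class TargetConditionedGenerator(nn.Module):
    """Hypothetical sketch of a target-conditioned perturbation generator:
    takes an image and a one-hot target label, returns a perturbation
    bounded in L_inf norm by eps. Sized for 28x28 MNIST-style inputs."""

    def __init__(self, num_classes=10, channels=1, eps=0.3):
        super().__init__()
        self.eps = eps  # L_inf budget for the perturbation
        # Embed the one-hot target label into a spatial map so it can be
        # concatenated with the image along the channel dimension.
        self.label_embed = nn.Linear(num_classes, 28 * 28)
        self.net = nn.Sequential(
            nn.Conv2d(channels + 1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, channels, 3, padding=1), nn.Tanh(),
        )

    def forward(self, x, target_onehot):
        b, _, h, w = x.shape
        label_map = self.label_embed(target_onehot).view(b, 1, h, w)
        delta = self.net(torch.cat([x, label_map], dim=1))
        return self.eps * delta  # tanh output rescaled into [-eps, eps]

# Usage: craft an adversarial example aimed at class 3.
gen = TargetConditionedGenerator()
x = torch.rand(1, 1, 28, 28)
target = torch.zeros(1, 10)
target[0, 3] = 1.0
x_adv = (x + gen(x, target)).clamp(0.0, 1.0)  # stay in valid pixel range
```

In a setting like the one the abstract describes, such a generator would be trained against a fixed pretrained classifier with a targeted loss (e.g., cross-entropy toward the requested label), so that a single model learns to attack any target class rather than one model per target.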
