Targeted Black-Box Adversarial Attack Method for Image Classification Models

Deep neural networks (DNNs) are widely applied to image classification tasks, but they are usually vulnerable: subtle pixel perturbations can cause classification errors, which poses a serious threat to DNN applications. Such perturbations can also corrupt other pattern recognition models, such as Naive Bayes (NB), Decision Tree (DT), and Random Forest (RF). In this paper, a general method is proposed to carry out targeted black-box attacks against image classification models. The proposed method achieves targeted fool rates (TFRs) of 0.873 and 0.781 on the CIFAR-10 dataset with and without access to the training set of the target model, respectively. For cross-model attacks, it still achieves a TFR of 0.630 on CIFAR-10. Furthermore, the proposed method can mount attacks covering up to 100 classes on the CIFAR-100 dataset with a TFR of 0.721, successfully handling 99 cases for each class. In our experiments, the proposed method shows higher performance and higher reliability than other black-box attack methods, with a maximum TFR 0.123 higher and a minimum TFR 0.602 higher than the previous methods UPSET and ANGRI [23] on CIFAR-10 for attacks trained on a single model.
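For concreteness, here is a minimal sketch of how a targeted fool rate can be computed. The abstract does not spell out the exact formula, so this PyTorch snippet reflects the common definition (the fraction of adversarial images that the target model classifies as the attacker-chosen class) rather than the paper's own evaluation code; the function and argument names are illustrative.

    import torch

    def targeted_fool_rate(model, adv_images, target_labels, device="cpu"):
        """Fraction of adversarial images classified as the chosen target class.

        One common definition of TFR; treat this as an illustrative
        assumption, not the paper's exact evaluation protocol.
        """
        model.eval()
        with torch.no_grad():
            logits = model(adv_images.to(device))
            preds = logits.argmax(dim=1)
        return (preds == target_labels.to(device)).float().mean().item()

Likewise, a generic baseline for the targeted black-box transfer setting the abstract describes (craft the perturbation on a local substitute model, then evaluate it on the unseen target model) is one-step targeted FGSM [18]. This is a standard technique shown for orientation only, not the paper's proposed method:

    import torch
    import torch.nn.functional as F

    def targeted_fgsm(substitute, images, target_labels, eps=8 / 255):
        """One-step targeted FGSM crafted against a local substitute model."""
        images = images.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(substitute(images), target_labels)
        loss.backward()
        # Step *against* the gradient to push predictions toward the target class.
        adv = images - eps * images.grad.sign()
        return adv.clamp(0.0, 1.0).detach()

In the transfer setting, targeted_fool_rate would then be evaluated with the held-out target model rather than the substitute used to craft the perturbation.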

[1] Dumitru Erhan et al. Going deeper with convolutions. CVPR, 2015.

[2] Sergey Ioffe et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. ICML, 2015.

[3] Kilian Q. Weinberger et al. Densely Connected Convolutional Networks. CVPR, 2017.

[4] Seyed-Mohsen Moosavi-Dezfooli et al. Universal Adversarial Perturbations. CVPR, 2017.

[5] Moustapha Cissé et al. Houdini: Fooling Deep Structured Visual and Speech Recognition Models with Adversarial Examples. NIPS, 2017.

[6] Sergey Ioffe et al. Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning. AAAI, 2017.

[7] Samy Bengio et al. Adversarial examples in the physical world. ICLR Workshop, 2017.

[8] Ajmal Mian et al. Threat of Adversarial Attacks on Deep Learning in Computer Vision: A Survey. IEEE Access, 2018.

[9] Yoshua Bengio et al. Generative Adversarial Nets. NIPS, 2014.

[10] François Chollet. Xception: Deep Learning with Depthwise Separable Convolutions. CVPR, 2017.

[11] Alex Krizhevsky. Learning Multiple Layers of Features from Tiny Images. Technical report, 2009.

[12] Ananthram Swami et al. Practical Black-Box Attacks against Deep Learning Systems using Adversarial Examples. arXiv preprint, 2016.

[13] Kouichi Sakurai et al. One Pixel Attack for Fooling Deep Neural Networks. IEEE Transactions on Evolutionary Computation, 2019.

[14] Nina Narodytska et al. Simple Black-Box Adversarial Perturbations for Deep Networks. arXiv preprint, 2016.

[15] Seyed-Mohsen Moosavi-Dezfooli et al. DeepFool: A Simple and Accurate Method to Fool Deep Neural Networks. CVPR, 2016.

[16] Fei-Fei Li et al. ImageNet: A large-scale hierarchical image database. CVPR, 2009.

[17] Geoffrey E. Hinton et al. Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer. ICLR, 2017.

[18] Jonathon Shlens et al. Explaining and Harnessing Adversarial Examples. ICLR, 2015.

[19] Qiang Chen et al. Network In Network. ICLR, 2014.

[20] Graham W. Taylor et al. Deconvolutional networks. CVPR, 2010.

[21] Geoffrey E. Hinton et al. ImageNet classification with deep convolutional neural networks. Communications of the ACM, 2017.

[22] Yoshua Bengio et al. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 1998.

[23] Rama Chellappa et al. UPSET and ANGRI: Breaking High Performance Image Classifiers. arXiv preprint, 2017.

[24] R. Venkatesh Babu et al. NAG: Network for Adversary Generation. CVPR, 2018.

[25] Joan Bruna et al. Intriguing properties of neural networks. ICLR, 2014.

[26] Fabio Roli et al. Pattern Recognition Systems under Attack: Design Issues and Research Challenges. International Journal of Pattern Recognition and Artificial Intelligence, 2014.

[27] Mark Sandler et al. MobileNetV2: Inverted Residuals and Linear Bottlenecks. CVPR, 2018.

[28] Jian Sun et al. Deep Residual Learning for Image Recognition. CVPR, 2016.