Discovering Adversarial Examples with Momentum

Machine learning models, especially deep neural networks, are vulnerable to adversarial examples: malicious inputs crafted by adding small amounts of noise to real examples so that the models are fooled. Adversarial examples also transfer from one model to another, which enables black-box attacks on real-world applications. In this paper, we propose a strong attack algorithm, the momentum iterative fast gradient sign method (MI-FGSM), to discover adversarial examples. MI-FGSM extends the iterative fast gradient sign method (I-FGSM) and significantly improves transferability. We also study how to attack an ensemble of models efficiently. Experiments demonstrate the effectiveness of the proposed algorithm. We hope that MI-FGSM can serve as a benchmark attack algorithm for evaluating the robustness of various models and defense methods.
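
The abstract names the algorithm but does not spell out its update rule, so the following is only a minimal sketch under stated assumptions. It assumes the commonly described form of a momentum iterative sign attack: a velocity vector accumulates L1-normalized loss gradients with a decay factor `mu`, and each step moves the input by `eps / steps` in the direction of the sign of that velocity, staying inside an L-infinity ball of radius `eps`. The toy logistic model, the helper names (`loss_gradient`, `logistic_loss`, `mi_fgsm`), and all parameter values are hypothetical and chosen only for illustration.

```python
import math


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def logistic_loss(x, w, b, y):
    """Cross-entropy loss of a toy linear logistic model (hypothetical stand-in for a network)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = 1.0 / (1.0 + math.exp(-z))
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def loss_gradient(x, w, b, y):
    """Analytic gradient of logistic_loss with respect to the input x."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = 1.0 / (1.0 + math.exp(-z))
    return [(p - y) * wi for wi in w]


def mi_fgsm(x, y, grad_fn, eps=0.1, steps=10, mu=1.0):
    """Momentum iterative sign attack (assumed form, not taken from the abstract)."""
    alpha = eps / steps          # per-step size so the total budget is eps
    g = [0.0] * len(x)           # accumulated velocity
    x_adv = list(x)
    for _ in range(steps):
        grad = grad_fn(x_adv, y)
        l1 = sum(abs(v) for v in grad) or 1.0   # avoid division by zero
        # Accumulate the L1-normalized gradient into the velocity.
        g = [mu * gi + v / l1 for gi, v in zip(g, grad)]
        # Ascend the loss along the sign of the velocity.
        x_adv = [xa + alpha * sign(gi) for xa, gi in zip(x_adv, g)]
        # Keep the perturbation inside the L-infinity ball of radius eps.
        x_adv = [min(max(xa, xi - eps), xi + eps) for xa, xi in zip(x_adv, x)]
    return x_adv


# Toy usage: attack a fixed linear model on one input with true label 1.
w, b = [1.0, -2.0, 0.5], 0.0
x, y, eps = [0.2, -0.1, 0.4], 1, 0.1
x_adv = mi_fgsm(x, y, lambda xs, ys: loss_gradient(xs, w, b, ys), eps=eps)
```

Setting `mu=0.0` in this sketch removes the accumulation, so each step depends only on the current gradient sign, which corresponds to a plain iterative sign attack. For a real network, `grad_fn` would be replaced by the gradient of the training loss with respect to the input, and a clip to the valid input range would normally follow each step.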
