TEAM: We Need More Powerful Adversarial Examples for DNNs

Although deep neural networks (DNNs) have achieved success in many application fields, they remain vulnerable to imperceptible adversarial examples that can easily cause misclassification. To overcome this challenge, many defensive methods have been proposed, and a powerful adversarial example is a key benchmark for measuring these defensive mechanisms. In this paper, we propose a novel method, TEAM (Taylor Expansion-Based Adversarial Methods), to generate more powerful adversarial examples than previous methods. The main idea is to craft adversarial examples by minimizing the confidence of the ground-truth class for untargeted attacks or maximizing the confidence of the target class for targeted attacks. Specifically, we define new objective functions that approximate the DNN with a second-order Taylor expansion within a tiny neighborhood of the input. The Lagrangian multiplier method is then used to obtain the optimal perturbations for these objective functions. To reduce the amount of computation, we further introduce the Gauss-Newton (GN) method to speed it up. Finally, experimental results show that our method reliably produces adversarial examples with a 100% attack success rate (ASR) while using smaller perturbations. In addition, adversarial examples generated with our method can defeat defensive distillation based on gradient masking.
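To make the idea concrete, the sketch below illustrates the untargeted case under several simplifying assumptions that are not taken from the paper: a PyTorch classifier, a single step of the local quadratic model, and a rank-1 Gauss-Newton surrogate H ≈ g gᵀ for the Hessian of the ground-truth confidence. The function name `team_step_untargeted`, the L2 budget `eps`, and the usage lines are hypothetical placeholders, not the authors' reference implementation.

```python
import torch
import torch.nn.functional as F

def team_step_untargeted(model, x, y, eps):
    """One Taylor-expansion step (illustrative sketch): minimize the ground-truth
    confidence p_y(x + delta) ~ p_y(x) + g^T delta + 0.5 * delta^T H delta
    subject to ||delta||_2 <= eps, with the Gauss-Newton surrogate H ~ g g^T."""
    x = x.clone().detach().requires_grad_(True)
    prob_y = F.softmax(model(x), dim=1)[0, y]        # confidence of the true class
    g = torch.autograd.grad(prob_y, x)[0].flatten()  # first-order term of the expansion

    g_norm = g.norm().item()
    if g_norm == 0.0:                                # flat point: no descent direction
        return torch.zeros_like(x).detach()

    # With H ~ g g^T the constrained quadratic is minimized along -g; the KKT
    # conditions pick the smaller of the unconstrained step and the L2 budget.
    alpha = min(1.0 / g_norm ** 2, eps / g_norm)
    delta = (-alpha * g).view_as(x)
    return delta.detach()

# Hypothetical usage (model, loader, and eps are placeholders):
# model = MyClassifier().eval()
# x, y = next(iter(loader))
# delta = team_step_untargeted(model, x[:1], int(y[0]), eps=0.03)
# x_adv = (x[:1] + delta).clamp(0.0, 1.0)
```

Because the rank-1 surrogate makes the constrained quadratic solvable in closed form along -g, the step reduces to choosing between the unconstrained minimizer 1/||g||² and the budget-limited step eps/||g||; the full method described in the abstract instead optimizes its objective functions with the Lagrangian multiplier and Gauss-Newton machinery.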
