Recent Advances in Understanding Adversarial Robustness of Deep Neural Networks

Adversarial examples are an unavoidable concern as deep neural networks (DNNs) are deployed in ever more applications. Imperceptible perturbations applied to natural samples can cause DNN-based classifiers to output wrong predictions with high confidence. It is therefore increasingly important to obtain models with high robustness, i.e. models that resist adversarial examples. In this paper, we survey recent advances in understanding this intriguing property, adversarial robustness, from different perspectives. We first give preliminary definitions of adversarial attacks and adversarial robustness. We then review frequently used benchmarks and theoretically proven bounds on adversarial robustness, and provide an overview of work analyzing the correlations between adversarial robustness and other critical properties of DNN models. Lastly, we introduce recent arguments about the potential costs of adversarial training, which have attracted wide attention from the research community.
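For concreteness, a minimal sketch of one common formulation of adversarial risk and adversarial training; the notation (classifier f_theta, data distribution D, surrogate loss L, perturbation budget epsilon under an l_p norm) is a generic convention assumed here, not the specific definitions used by this survey.

\[
  \mathcal{R}_{\mathrm{adv}}(f_\theta)
  \;=\;
  \mathbb{E}_{(x,y)\sim\mathcal{D}}
  \Big[
    \max_{\|\delta\|_p \le \epsilon}
    \mathbb{1}\!\left\{ f_\theta(x + \delta) \ne y \right\}
  \Big],
  \qquad
  \min_{\theta}\;
  \mathbb{E}_{(x,y)\sim\mathcal{D}}
  \Big[
    \max_{\|\delta\|_p \le \epsilon}
    L\!\left(f_\theta(x + \delta),\, y\right)
  \Big].
\]

Under this convention, an adversarial example is any perturbed input x + delta within the epsilon-ball that changes the prediction, and adversarial training minimizes the worst-case loss over that ball rather than the standard (clean) loss.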
