Recent Advances in Adversarial Training for Adversarial Robustness

Adversarial training is among the most effective approaches for defending deep learning models against adversarial examples. Unlike many other defense strategies, adversarial training aims to enhance model robustness intrinsically, by incorporating adversarial examples into the training process itself. Over the past few years it has been studied and discussed from many angles, and this body of work deserves a comprehensive review. In this survey we present, to the best of our knowledge, the first systematic review of recent progress on adversarial training for adversarial robustness, organized under a novel taxonomy. We then discuss the generalization problems that arise in adversarial training from three perspectives and highlight the challenges that remain open. Finally, we outline potential future directions.
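To make the idea concrete, the canonical formulation of adversarial training is a min-max problem: the inner maximization crafts a worst-case perturbation within an epsilon-ball, and the outer minimization updates the model on the perturbed inputs. Below is a minimal, hedged sketch of this loop using a toy logistic-regression model and a single-step (FGSM-style) inner maximization; the data, step sizes, and model are illustrative stand-ins, not any method from the surveyed papers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary classification data: two Gaussian blobs.
X = np.vstack([rng.normal(-1.0, 0.5, (100, 2)),
               rng.normal(+1.0, 0.5, (100, 2))])
y = np.concatenate([np.zeros(100), np.ones(100)])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_grad_w(w, X, y):
    """Gradient of the mean logistic loss w.r.t. the weights."""
    p = sigmoid(X @ w)
    return X.T @ (p - y) / len(y)

def fgsm_perturb(w, X, y, eps):
    """Approximate inner maximization: one signed-gradient step on the
    inputs (FGSM-style), staying inside the L-infinity eps-ball."""
    p = sigmoid(X @ w)
    grad_x = np.outer(p - y, w)  # d(loss)/dx for each example
    return X + eps * np.sign(grad_x)

def adversarial_train(X, y, eps=0.1, lr=0.5, steps=500):
    """Outer minimization: gradient descent on adversarially
    perturbed inputs (min over w of max over delta of the loss)."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        X_adv = fgsm_perturb(w, X, y, eps)   # inner max (approximate)
        w -= lr * loss_grad_w(w, X_adv, y)   # outer min
    return w

w = adversarial_train(X, y)
clean_acc = np.mean((sigmoid(X @ w) > 0.5) == y)
robust_acc = np.mean((sigmoid(fgsm_perturb(w, X, y, 0.1) @ w) > 0.5) == y)
```

Replacing the single FGSM step with several projected gradient steps would give the PGD-style inner maximization that much of the surveyed literature builds on; the outer loop is unchanged.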
