Adversarial Training against Location-Optimized Adversarial Patches

Deep neural networks have been shown to be susceptible to adversarial examples -- small, imperceptible changes constructed to cause misclassification in otherwise highly accurate image classifiers. As a practical alternative, recent work proposed so-called adversarial patches: clearly visible, but adversarially crafted, rectangular patches in images. These patches can easily be printed and applied in the physical world. While defenses against imperceptible adversarial examples have been studied extensively, robustness against adversarial patches is poorly understood. In this work, we first devise a practical approach to obtain adversarial patches while actively optimizing their location within the image. Then, we apply adversarial training on these location-optimized adversarial patches and demonstrate significantly improved robustness on CIFAR10 and GTSRB. Additionally, in contrast to adversarial training on imperceptible adversarial examples, our adversarial patch training does not reduce accuracy.
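The following is a minimal sketch, not the authors' exact implementation, of the two components described above: an attack that jointly optimizes the patch content (signed gradient ascent on the classification loss) and its location (a simple local search over candidate positions), and an adversarial training loop that fits the model on the resulting patched images. It assumes a PyTorch image classifier; the function names (`patch_attack`, `apply_patch`) and all hyperparameters are illustrative choices, not values from the paper.

```python
import torch
import torch.nn.functional as F


def apply_patch(images, patch, y0, x0):
    """Paste a square patch onto a batch of images at position (y0, x0)."""
    out = images.clone()
    p = patch.size(-1)
    out[:, :, y0:y0 + p, x0:x0 + p] = patch  # broadcasts (C, p, p) over the batch
    return out


def patch_attack(model, images, labels, patch_size=8, iters=25, step=8 / 255, shifts=4):
    """Craft an adversarial patch while optimizing its location (illustrative sketch)."""
    _, _, h, w = images.shape
    y0 = torch.randint(0, h - patch_size + 1, (1,)).item()
    x0 = torch.randint(0, w - patch_size + 1, (1,)).item()
    patch = torch.rand(images.size(1), patch_size, patch_size, device=images.device)

    for _ in range(iters):
        # 1) Update patch values: signed gradient ascent on the cross-entropy loss.
        patch.requires_grad_(True)
        loss = F.cross_entropy(model(apply_patch(images, patch, y0, x0)), labels)
        grad, = torch.autograd.grad(loss, patch)
        patch = (patch.detach() + step * grad.sign()).clamp(0, 1)

        # 2) Update patch location: keep the candidate position with the highest loss.
        with torch.no_grad():
            best_loss = F.cross_entropy(
                model(apply_patch(images, patch, y0, x0)), labels).item()
            best_pos = (y0, x0)
            for _ in range(shifts):
                cy = torch.randint(0, h - patch_size + 1, (1,)).item()
                cx = torch.randint(0, w - patch_size + 1, (1,)).item()
                cand = F.cross_entropy(
                    model(apply_patch(images, patch, cy, cx)), labels).item()
                if cand > best_loss:
                    best_loss, best_pos = cand, (cy, cx)
            y0, x0 = best_pos

    return apply_patch(images, patch, y0, x0)


def adversarial_patch_training(model, loader, optimizer, epochs=1):
    """Adversarial training loop: train only on location-optimized patched inputs."""
    for _ in range(epochs):
        for images, labels in loader:
            model.eval()  # craft patches against the current model state
            adv = patch_attack(model, images, labels)
            model.train()
            optimizer.zero_grad()
            F.cross_entropy(model(adv), labels).backward()
            optimizer.step()
```

In this sketch the location search is a random local search per iteration; greedier or exhaustive location strategies fit the same loop by replacing the candidate-sampling step.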
