Natural Adversarial Examples

We introduce natural adversarial examples -- real-world, unmodified, naturally occurring examples that substantially degrade classifier accuracy. We curate 7,500 natural adversarial examples and release them as an ImageNet classifier test set that we call ImageNet-A. This dataset serves as a new way to measure classifier robustness. Like ℓ_p adversarial examples, ImageNet-A examples successfully transfer to unseen or black-box classifiers. For example, on ImageNet-A a DenseNet-121 obtains around 2% accuracy, an accuracy drop of approximately 90%. Recovering this accuracy is not simple, because ImageNet-A examples exploit deep flaws in current classifiers, including their over-reliance on color, texture, and background cues. We observe that popular training techniques for improving robustness have little effect, but we show that some architectural changes can enhance robustness to natural adversarial examples. Future research is required to enable robust generalization to this hard ImageNet test set.
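The accuracy numbers above follow a standard evaluation protocol: run a pretrained 1000-class ImageNet model on ImageNet-A (which covers a 200-class subset of ImageNet-1k, with image folders named by WordNet ID) and score top-1 accuracy after restricting the logits to those 200 classes. Below is a minimal PyTorch sketch of that protocol, not the paper's released evaluation code; the dataset path and the `wnid_to_index.json` file (mapping ImageNet-A folder names to ImageNet-1k class indices) are assumptions the user must supply.

```python
# Minimal sketch: evaluate a pretrained DenseNet-121 on ImageNet-A.
# Assumes ImageNet-A is extracted to ./imagenet-a with one folder per
# WordNet ID, and a user-supplied wnid_to_index.json mapping each wnid
# to its ImageNet-1k class index, e.g. {"n01498041": 4, ...} (hypothetical).
import json

import torch
from torchvision import datasets, models, transforms

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

dataset = datasets.ImageFolder("./imagenet-a", transform=preprocess)
loader = torch.utils.data.DataLoader(dataset, batch_size=64, num_workers=4)

model = models.densenet121(pretrained=True).eval()

# subset_idx[i] = ImageNet-1k index of the i-th ImageNet-A class, so that
# restricting the 1000-way logits to these columns yields 200-way predictions
# aligned with ImageFolder's label ordering.
wnid_to_imagenet_idx = json.load(open("wnid_to_index.json"))
subset_idx = torch.tensor(
    [wnid_to_imagenet_idx[wnid] for wnid in dataset.classes]
)

correct = total = 0
with torch.no_grad():
    for images, labels in loader:
        logits = model(images)[:, subset_idx]  # keep only the 200 classes
        correct += (logits.argmax(dim=1) == labels).sum().item()
        total += labels.size(0)

print(f"ImageNet-A top-1 accuracy: {100.0 * correct / total:.2f}%")
```

Under this protocol, an ordinary ImageNet-trained model should score in the low single digits, consistent with the ~2% DenseNet-121 figure reported above.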
