The Effects of Image Distribution and Task on Adversarial Robustness

In this paper, we propose an adaptation of the area under the curve (AUC) metric to measure the adversarial robustness of a model over a particular ε-interval [ε_0, ε_1] (an interval of adversarial perturbation strengths) that facilitates unbiased comparisons across models even when they have different initial ε_0 performance. This metric can be used to determine how adversarially robust a model is to different image distributions or tasks (or some other variable), and/or to compare the robustness of one model against other models. We applied this adversarial robustness metric to models trained on MNIST, CIFAR-10, and a Fusion dataset (CIFAR-10 + MNIST), where each model performed either a digit or object recognition task using a LeNet, ResNet50, or fully connected network (FullyConnectedNet) architecture, and found the following: 1) CIFAR-10 models are inherently less adversarially robust than MNIST models; 2) both the image distribution and the task that a model is trained on can affect the adversarial robustness of the resultant model; and 3) pretraining with a different image distribution and task sometimes carries over the adversarial robustness induced by that image distribution and task into the resultant model. Collectively, our results imply non-trivial differences in the learned representation space of one perceptual system over another, given its exposure to different image statistics or tasks (mainly objects vs. digits). Moreover, these results hold even when model systems are equalized to have the same level of performance, or when exposed to approximately matched image statistics of fusion images but with different tasks.
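The abstract does not spell out the exact formula, but a minimal sketch of one plausible implementation is shown below: the area under the accuracy-vs-ε curve over [ε_0, ε_1], normalized so that models with different initial ε_0 accuracy can still be compared. The function name `robustness_auc` and the normalization by the ε_0 accuracy are illustrative assumptions, not necessarily the authors' exact definition.

```python
import numpy as np

def robustness_auc(accuracies, epsilons):
    """Adapted AUC robustness score over an epsilon-interval (illustrative sketch).

    accuracies : model accuracy at each perturbation strength
                 (accuracies[0] is the accuracy at epsilon_0).
    epsilons   : increasing perturbation strengths [eps_0, ..., eps_1].

    Returns the area under the accuracy-vs-epsilon curve, normalized by the
    rectangle accuracies[0] * (eps_1 - eps_0), so a score of 1.0 means no
    accuracy was lost anywhere on the interval regardless of the model's
    starting accuracy. The normalization choice is an assumption, not
    necessarily the paper's exact metric.
    """
    accuracies = np.asarray(accuracies, dtype=float)
    epsilons = np.asarray(epsilons, dtype=float)
    area = np.trapz(accuracies, epsilons)                    # raw AUC
    max_area = accuracies[0] * (epsilons[-1] - epsilons[0])  # perfect retention
    return area / max_area

# Hypothetical accuracies of a model under attacks of increasing strength epsilon.
eps = [0.0, 0.1, 0.2, 0.3]
acc = [0.99, 0.80, 0.45, 0.20]
print(robustness_auc(acc, eps))  # ~0.62 for this hypothetical curve
```

Dividing by the ε_0 accuracy rather than by a fixed ceiling of 1.0 is what would allow, say, a CIFAR-10 model and an MNIST model with different clean accuracies to be compared on how quickly each loses the performance it started with.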
