Biologically Inspired Mechanisms for Adversarial Robustness

A convolutional neural network that is strongly robust to adversarial perturbations at reasonable computational and performance cost has not yet been demonstrated. The primate visual ventral stream appears to be robust to small perturbations in visual stimuli, but the underlying mechanisms that give rise to this robust perception are not understood. In this work, we investigate the role of two biologically plausible mechanisms in adversarial robustness. We demonstrate that the non-uniform sampling performed by the primate retina and the presence of receptive fields spanning a range of sizes at each eccentricity improve the robustness of neural networks to small adversarial perturbations. We verify that these two mechanisms do not suffer from gradient obfuscation and study their contribution to adversarial robustness through ablation studies.
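To make the first mechanism concrete, the sketch below shows one simple way to approximate eccentricity-dependent, retina-like sampling as an image preprocessing step: pixels near a central fixation point are sampled densely, and sampling becomes coarser with distance from it. This is a minimal illustration only, not the transform used in the paper; the exponential radial warp, the `retinal_sample` name, and the `alpha` and `out_size` parameters are assumptions made for this sketch. The second mechanism, pooling over a range of receptive field sizes at each eccentricity, could be approximated analogously by applying filters of several sizes at each warped location.

```python
# Illustrative sketch (not the paper's implementation): a toy "retinal"
# resampling that samples an image densely near a central fixation point
# and increasingly sparsely toward the periphery. The exponential radial
# warp and the default parameters are assumptions for illustration only.
import numpy as np
from scipy.ndimage import map_coordinates


def retinal_sample(image, out_size=64, alpha=2.0):
    """Resample `image` on a non-uniform grid whose spacing grows with
    eccentricity, loosely mimicking the fovea-to-periphery fall-off in
    retinal sampling density."""
    h, w = image.shape[:2]
    # Output coordinates in [-1, 1], centered on the fixation point.
    lin = np.linspace(-1.0, 1.0, out_size)
    yy, xx = np.meshgrid(lin, lin, indexing="ij")
    r = np.sqrt(xx ** 2 + yy ** 2)               # eccentricity of each sample
    theta = np.arctan2(yy, xx)
    # Radial warp: nearly linear near r = 0 (dense, fovea-like sampling)
    # and rapidly expanding at large r (sparse, periphery-like sampling).
    r_warped = (np.exp(alpha * r) - 1.0) / (np.exp(alpha) - 1.0)
    # Map warped polar coordinates back to source pixel positions; clip
    # samples that fall outside the image (the corners of the warped grid).
    src_y = np.clip((r_warped * np.sin(theta) + 1.0) * 0.5 * (h - 1), 0, h - 1)
    src_x = np.clip((r_warped * np.cos(theta) + 1.0) * 0.5 * (w - 1), 0, w - 1)
    # Bilinear interpolation at the warped source coordinates, per channel.
    if image.ndim == 2:
        return map_coordinates(image, [src_y, src_x], order=1)
    return np.stack(
        [map_coordinates(image[..., c], [src_y, src_x], order=1)
         for c in range(image.shape[-1])],
        axis=-1,
    )


# Example: foveate a random stand-in image before feeding it to a CNN.
img = np.random.rand(224, 224, 3)
foveated = retinal_sample(img, out_size=64)      # shape (64, 64, 3)
```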
