Adversarial Attacks Hidden in Plain Sight

Convolutional neural networks have achieved a string of successes in recent years, but their lack of interpretability remains a serious issue. Adversarial examples are designed to deliberately fool neural networks into producing any desired incorrect classification, potentially with very high confidence. Several defensive approaches increase robustness against adversarial attacks, forcing attackers to use perturbations of greater magnitude, which in turn lead to visible artifacts. By taking human visual perception into account, we develop a technique that hides such adversarial perturbations in regions of high image complexity, so that they are imperceptible even to an astute observer. We carry out a user study on classifying adversarially modified images to validate the perceptual quality of our approach and find significant evidence that the perturbations are concealed with regard to human visual perception.
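To make the idea of confining a perturbation to visually complex regions concrete, here is a minimal sketch in PyTorch. It is an illustration only, not the paper's method: the local-standard-deviation complexity proxy, the FGSM base attack, and all function names (`complexity_map`, `masked_fgsm`) are assumptions introduced for this example.

```python
# Hedged sketch: weight an adversarial perturbation by a per-pixel complexity
# map so that it concentrates in textured, high-complexity regions.
# The specific complexity measure and attack are assumptions, not the paper's.
import torch
import torch.nn.functional as F

def complexity_map(image, window=7):
    """Local standard deviation as a rough proxy for visual complexity."""
    gray = image.mean(dim=1, keepdim=True)               # (B, 1, H, W)
    pad = window // 2
    kernel = torch.ones(1, 1, window, window, device=image.device) / window**2
    mean = F.conv2d(F.pad(gray, [pad] * 4, mode="reflect"), kernel)
    sq_mean = F.conv2d(F.pad(gray**2, [pad] * 4, mode="reflect"), kernel)
    std = (sq_mean - mean**2).clamp_min(0).sqrt()
    # Normalize per image so flat regions get weight near 0, textured ones near 1.
    return std / (std.amax(dim=(2, 3), keepdim=True) + 1e-8)

def masked_fgsm(model, image, label, eps=8 / 255):
    """Single FGSM step whose magnitude is scaled by the complexity map."""
    image = image.clone().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    grad = torch.autograd.grad(loss, image)[0]
    perturbation = eps * grad.sign() * complexity_map(image.detach())
    return (image.detach() + perturbation).clamp(0, 1)
```

In this sketch the perturbation budget is simply modulated pixel-wise; an iterative attack (e.g., PGD) could apply the same mask at every step to keep smooth regions untouched.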
