Most ReLU Networks Suffer from 𝓁2 Adversarial Perturbations

We consider ReLU networks with random weights, in which the dimension decreases at each layer. We show that for most such networks, most examples x admit an adversarial perturbation at a Euclidean distance of O(‖x‖/√d), where d is the input dimension. Moreover, this perturbation can be found via gradient flow, as well as via gradient descent with sufficiently small steps. This result can be seen as an explanation for the abundance of adversarial examples, and for the fact that they are found via gradient descent.
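The setting can be illustrated numerically. The following sketch (not code from the paper; the layer widths, weight scaling, and step size are illustrative assumptions) draws a random ReLU network whose layer dimensions decrease, runs plain gradient descent on the input until the sign of the scalar output flips, and compares the resulting perturbation norm with ‖x‖/√d.

```python
# Illustrative sketch (not the paper's code): a random ReLU network with
# decreasing layer widths, attacked by small-step gradient descent on the input.
import torch

torch.manual_seed(0)
d = 1000                       # input dimension
widths = [d, 500, 250, 1]      # dimensions decrease at each layer
Ws = [torch.randn(m, n) / n ** 0.5          # assumed 1/sqrt(fan-in) scaling
      for n, m in zip(widths[:-1], widths[1:])]

def net(x):
    # ReLU layers with random weights; the last layer is linear with scalar output.
    for W in Ws[:-1]:
        x = torch.relu(W @ x)
    return (Ws[-1] @ x).squeeze()

x0 = torch.randn(d)
y0 = torch.sign(net(x0)).item()            # sign of the output at the clean input

x = x0.clone().requires_grad_(True)
lr = 1e-2                                  # "sufficiently small" step size (assumed)
for _ in range(5000):
    out = net(x)
    if torch.sign(out).item() != y0:       # output sign has flipped: adversarial
        break
    loss = y0 * out                        # decreasing this drives the output to 0
    loss.backward()
    with torch.no_grad():
        x -= lr * x.grad
    x.grad.zero_()

delta = (x.detach() - x0).norm().item()
print(f"perturbation norm  : {delta:.3f}")
print(f"‖x0‖/√d (reference): {(x0.norm().item() / d ** 0.5):.3f}")
```

On typical random draws, the printed perturbation should come out on the order of ‖x0‖/√d, which is the scaling in the statement above.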
