SparseFool: A Few Pixels Make a Big Difference

Deep Neural Networks have achieved extraordinary results on image classification tasks, but have been shown to be vulnerable to attacks with carefully crafted perturbations of the input data. Although most attacks typically change the values of many of an image's pixels, it has been shown that deep networks are also vulnerable to sparse alterations of the input. However, no computationally efficient method has been proposed to compute sparse perturbations. In this paper, we exploit the low mean curvature of the decision boundary and propose SparseFool, a geometry-inspired sparse attack that controls the sparsity of the perturbations. Extensive evaluations show that our approach computes sparse perturbations very fast and scales efficiently to high-dimensional data. We further analyze the transferability and the visual effects of the perturbations, and show the existence of shared semantic information across images and networks. Finally, we show that adversarial training only slightly improves robustness against sparse additive perturbations computed with SparseFool.
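
To make the geometric intuition concrete, the following is a minimal sketch (in PyTorch) of how a sparse, boundary-linearization attack of this flavor could look. It is not the paper's reference implementation: the model net, the single input tensor x (pixel values assumed in [0, 1]), the hyperparameters, and the one-coordinate-per-step simplification are illustrative assumptions only. The idea it demonstrates is that, once the decision boundary is locally approximated by a hyperplane with normal w, the l1-minimal step that reaches that hyperplane changes only the coordinate with the largest |w| entry, which is what keeps the perturbation sparse.

# Illustrative sketch of a geometry-inspired sparse attack (assumptions noted above).
import torch

def sparse_attack_sketch(net, x, max_iter=50, overshoot=0.02):
    """Iteratively linearize the decision boundary and try to cross it by
    changing as few pixels as possible (one coordinate per step)."""
    net.eval()
    x_adv = x.clone().detach()
    orig_label = net(x.unsqueeze(0)).argmax(dim=1).item()

    for _ in range(max_iter):
        x_adv.requires_grad_(True)
        logits = net(x_adv.unsqueeze(0)).squeeze(0)
        if logits.argmax().item() != orig_label:     # boundary crossed: done
            break

        # Local linear approximation of the boundary between the original
        # class and the closest competing class.
        other = logits.clone()
        other[orig_label] = -float("inf")
        target = other.argmax().item()
        f = logits[target] - logits[orig_label]      # signed margin (< 0 here)
        w = torch.autograd.grad(f, x_adv)[0]         # normal of the hyperplane

        # l1-minimal step onto the approximated hyperplane: move only the
        # coordinate with the largest |w| entry.
        with torch.no_grad():
            idx = w.abs().view(-1).argmax()
            step = (f.abs() + 1e-4) / (w.view(-1)[idx].abs() + 1e-10)
            delta = torch.zeros_like(x_adv).view(-1)
            delta[idx] = -step * torch.sign(f) * torch.sign(w.view(-1)[idx])
            x_adv = x_adv.detach() + (1 + overshoot) * delta.view_as(x_adv)
            x_adv = x_adv.clamp(0.0, 1.0)            # stay in the valid pixel range

    return x_adv.detach()

In practice the paper's method handles the box constraints and the choice of perturbed coordinates more carefully; this sketch only illustrates why a locally flat (low mean curvature) boundary makes such one-pixel-at-a-time steps effective.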
