DIPDefend: Deep Image Prior Driven Defense against Adversarial Examples

Deep neural networks (DNNs) are seriously vulnerable to adversarial examples, i.e., clean images corrupted by imperceptible perturbations. Most existing input-transformation based defenses (e.g., ComDefend) rely heavily on external priors learned from a large training dataset while neglecting the rich internal priors of the input image itself; as a result, they generalize poorly to adversarial examples whose statistics deviate from those of the training data. Motivated by the deep image prior, which can capture rich image statistics from a single image, we propose an effective Deep Image Prior Driven Defense (DIPDefend) method against adversarial examples. When fitting a DIP generator to the target/adversarial input, we observe an interesting learning preference from a feature-learning perspective: the early stage of reconstruction primarily learns robust features that resist adversarial perturbation, and only later does the network fit non-robust features that are sensitive to it. In addition, we develop an adaptive stopping strategy that adapts our method to diverse input images. In this way, the proposed model builds a unique defender for each individual adversarial input and is thus robust to various attackers. Experimental results demonstrate the superiority of our method over state-of-the-art defenses against both white-box and black-box adversarial attacks.
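
To make the pipeline concrete, below is a minimal PyTorch sketch of the idea: a randomly initialized generator is fitted to the adversarial input by gradient descent, and optimization is stopped adaptively once the smoothed reconstruction loss flattens, i.e., before the network begins to fit the non-robust, perturbation-sensitive details. The generator architecture, hyperparameters, and the simple exponential-smoothing stop rule are illustrative assumptions rather than the paper's exact settings (the citation of Holt-Winters smoothing [21] suggests a related, more elaborate smoothing criterion).

import torch
import torch.nn as nn

def build_generator(code_channels=32):
    # Small conv net standing in for the DIP hourglass generator (assumption).
    return nn.Sequential(
        nn.Conv2d(code_channels, 64, 3, padding=1), nn.ReLU(),
        nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
        nn.Conv2d(64, 3, 3, padding=1), nn.Sigmoid(),
    )

def dip_defend(x_adv, max_iters=2000, alpha=0.1, stop_delta=1e-5):
    """Reconstruct x_adv (a 1x3xHxW tensor in [0, 1]) with a DIP generator,
    stopping early so that mainly robust image content is recovered."""
    z = torch.randn(1, 32, x_adv.shape[-2], x_adv.shape[-1])  # fixed random code
    g = build_generator()
    opt = torch.optim.Adam(g.parameters(), lr=1e-3)
    smoothed = prev = None
    x_hat = x_adv
    for _ in range(max_iters):
        opt.zero_grad()
        x_hat = g(z)
        loss = ((x_hat - x_adv) ** 2).mean()
        loss.backward()
        opt.step()
        # Exponentially smooth the loss curve and stop once progress flattens,
        # i.e., before the generator fits the adversarial perturbation itself.
        smoothed = loss.item() if smoothed is None else alpha * loss.item() + (1 - alpha) * smoothed
        if prev is not None and abs(prev - smoothed) < stop_delta:
            break
        prev = smoothed
    return x_hat.detach()

The returned reconstruction, rather than the raw adversarial input, is then passed to the downstream classifier; since the generator is re-fitted from scratch for every input, each image gets its own defender.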

[1] Jun Zhu, et al. Evading Defenses to Transferable Adversarial Examples by Translation-Invariant Attacks. CVPR, 2019.

[2] Samy Bengio, et al. Adversarial Machine Learning at Scale. ICLR, 2017.

[3] Aleksander Madry, et al. Towards Deep Learning Models Resistant to Adversarial Attacks. ICLR, 2018.

[4] Jonathon Shlens, et al. Explaining and Harnessing Adversarial Examples. ICLR, 2015.

[5] Ananthram Swami, et al. The Limitations of Deep Learning in Adversarial Settings. IEEE EuroS&P, 2016.

[6] Jun Zhu, et al. Boosting Adversarial Attacks with Momentum. CVPR, 2018.

[7] Xin Zhang, et al. End to End Learning for Self-Driving Cars. arXiv, 2016.

[8] Aleksander Madry, et al. Adversarial Examples Are Not Bugs, They Are Features. NeurIPS, 2019.

[9] Alan L. Yuille, et al. Feature Denoising for Improving Adversarial Robustness. CVPR, 2019.

[10] Li Fei-Fei, et al. Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation. CVPR, 2019.

[11] Yan Feng, et al. Hilbert-Based Generative Defense for Adversarial Examples. ICCV, 2019.

[12] Tao Liu, et al. Feature Distillation: DNN-Oriented JPEG Compression Against Adversarial Examples. CVPR, 2019.

[13] Vineeth N. Balasubramanian, et al. Harnessing the Vulnerability of Latent Layers in Adversarially Trained Models. IJCAI, 2019.

[14] Aleksander Madry, et al. Learning Perceptually-Aligned Representations via Adversarial Robustness. arXiv, 2019.

[15] Dan Boneh, et al. Ensemble Adversarial Training: Attacks and Defenses. ICLR, 2018.

[16] David A. Wagner, et al. Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples. ICML, 2018.

[17] Xiaolin Hu, et al. Defense Against Adversarial Attacks Using High-Level Representation Guided Denoiser. CVPR, 2018.

[18] Mingyan Liu, et al. Spatially Transformed Adversarial Examples. ICLR, 2018.

[19] Jian Sun, et al. Identity Mappings in Deep Residual Networks. ECCV, 2016.

[20] Subhransu Maji, et al. A Bayesian Perspective on the Deep Image Prior. CVPR, 2019.

[21] Paul Goodwin, et al. The Holt-Winters Approach to Exponential Smoothing: 50 Years Old and Going Strong. 2010.

[22] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks. Commun. ACM, 2017.

[23] Aleksander Madry, et al. Robustness May Be at Odds with Accuracy. ICLR, 2019.

[24] Rama Chellappa, et al. Defense-GAN: Protecting Classifiers Against Adversarial Attacks Using Generative Models. ICLR, 2018.

[25] Yang Song, et al. PixelDefend: Leveraging Generative Models to Understand and Defend against Adversarial Examples. ICLR, 2018.

[26] Samy Bengio, et al. Adversarial examples in the physical world. ICLR, 2017.

[27] Yanjun Qi, et al. Feature Squeezing: Detecting Adversarial Examples in Deep Neural Networks. NDSS, 2018.

[28] Joan Bruna, et al. Intriguing properties of neural networks. ICLR, 2014.

[29] Michal Irani, et al. “Double-DIP”: Unsupervised Image Decomposition via Coupled Deep-Image-Priors. CVPR, 2019.

[30] David A. Wagner, et al. Towards Evaluating the Robustness of Neural Networks. IEEE S&P, 2017.

[31] Alan L. Yuille, et al. Improving Transferability of Adversarial Examples With Input Diversity. CVPR, 2019.

[32] Xiaochun Cao, et al. ComDefend: An Efficient Image Compression Model to Defend Adversarial Examples. CVPR, 2019.

[33] Ziming Zhang, et al. White-Box Adversarial Defense via Self-Supervised Data Estimation. arXiv, 2019.

[34] Andrea Vedaldi, et al. Deep Image Prior. IJCV, 2020.

[35] Moustapha Cissé, et al. Countering Adversarial Images using Input Transformations. ICLR, 2018.

[36] Seyed-Mohsen Moosavi-Dezfooli, et al. DeepFool: A Simple and Accurate Method to Fool Deep Neural Networks. CVPR, 2016.

[37] Ananthram Swami, et al. Distillation as a Defense to Adversarial Perturbations Against Deep Neural Networks. IEEE S&P, 2016.

[38] Shu-Tao Xia, et al. Second-Order Attention Network for Single Image Super-Resolution. CVPR, 2019.