Gradient-Based Adversarial Image Forensics

Adversarial images, which can fool deep neural networks, have drawn researchers' attention to the security of machine learning. In this paper, we propose a blind forensic method to detect adversarial images generated by gradient-based attacks, including FGSM, BIM, RFGSM, and PGD. By analyzing adversarial images, we find that gradient-based attacks cause significant statistical changes in the image difference domain. Moreover, these attacks add different perturbations to the R, G, and B channels, which inevitably alters the dependencies among the channels. To measure these dependencies, we construct the feature from the \(3^{rd}\)-order co-occurrence. Unlike previous works, which extract the co-occurrence within each channel, we extract co-occurrences across the \(1^{st}\)-order differences of the R, G, and B channels to capture changes in the inter-channel dependencies. Because the attacks shift the difference elements, some co-occurrence elements of adversarial images take distinctly larger values than those of legitimate images. Experimental results demonstrate that the proposed method performs stably across different attack types and attack strengths, achieving detection accuracy of up to 99.9%, which substantially exceeds the state of the art.
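
For context, the gradient-based attacks listed above all perturb the input in the direction of the loss gradient; FGSM is the simplest member of the family. The sketch below is a minimal PyTorch-style illustration, not code from the paper, and `model`, `loss_fn`, `x`, `y`, and `eps` are assumed placeholders for a classifier, its loss, a normalized input batch, the true labels, and the attack strength. Because the gradient sign is taken element-wise, each color channel generally receives a different perturbation, which is the change in channel dependencies the detector exploits.

```python
import torch

def fgsm(model, loss_fn, x, y, eps=8 / 255):
    """Minimal FGSM sketch: x_adv = x + eps * sign(grad_x L(f(x), y)).

    model, loss_fn, x, y are assumed to be a classifier, its loss,
    a normalized input batch in [0, 1], and the true labels.
    """
    x = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x), y)
    loss.backward()
    # The per-pixel, per-channel gradient sign generally differs across R, G, B,
    # so the three channels are perturbed differently.
    x_adv = x + eps * x.grad.sign()
    return x_adv.clamp(0, 1).detach()
```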

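The cross-channel feature itself can be sketched as follows. This is a minimal sketch under assumptions: the horizontal \(1^{st}\)-order difference, the truncation threshold `T`, and the histogram normalization are choices borrowed from rich-model steganalysis features and are not necessarily the exact parameters used in the paper. The co-occurrence is taken over the (R, G, B) difference triple at each pixel position, which is one way to capture the inter-channel dependence the abstract describes.

```python
import numpy as np

def cross_channel_cooccurrence(img, T=2):
    """Sketch of a cross-channel 3rd-order co-occurrence feature.

    img : H x W x 3 uint8 RGB image.
    T   : truncation threshold (an assumption; the paper's value may differ).

    Returns a normalized histogram of size (2T+1)^3 counting how often the
    triple of truncated 1st-order horizontal differences (d_R, d_G, d_B)
    occurs at the same spatial position across the three channels.
    """
    x = img.astype(np.int32)
    # 1st-order horizontal difference in each channel: D[i, j] = x[i, j+1] - x[i, j]
    diff = x[:, 1:, :] - x[:, :-1, :]
    # Truncate to [-T, T], as in rich-model steganalysis features
    diff = np.clip(diff, -T, T)
    # Shift to non-negative bin indices 0 .. 2T
    d = diff + T
    bins = 2 * T + 1
    # Joint index of the (R, G, B) difference triple at each position
    idx = (d[..., 0] * bins + d[..., 1]) * bins + d[..., 2]
    hist = np.bincount(idx.ravel(), minlength=bins ** 3).astype(np.float64)
    return hist / hist.sum()  # normalize so images of different sizes are comparable
```

In a detector built along these lines, the resulting \((2T+1)^3\)-dimensional histograms extracted from legitimate and adversarial training images would feed a standard classifier; the specific classifier and co-occurrence parameters used in the paper may differ from this sketch.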