Detection Based Defense Against Adversarial Examples From the Steganalysis Point of View

Deep Neural Networks (DNNs) have recently led to significant improvements in many fields. However, DNNs are vulnerable to adversarial examples: samples crafted with imperceptible perturbations that nevertheless dramatically mislead the DNNs. Moreover, adversarial examples can be used to attack various kinds of DNN-based systems even when the adversary has no access to the underlying model. Many defense methods have been proposed, such as obfuscating the gradients of the network or detecting adversarial examples. However, these defenses have been shown to be ineffective or unable to resist secondary adversarial attacks. In this paper, we point out that steganalysis can be applied to the detection of adversarial examples, and we propose a method that enhances steganalysis features by estimating the probability of modifications caused by adversarial attacks. Experimental results show that the proposed method can accurately detect adversarial examples. Moreover, secondary adversarial attacks are difficult to mount directly against our method, because it relies not on a neural network but on high-dimensional hand-crafted features and a Fisher Linear Discriminant ensemble.
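To make the overall pipeline concrete, the sketch below is a minimal, hypothetical illustration of the idea: steganalysis-style co-occurrence features are extracted from pixel residuals and fed to a bagged ensemble of Fisher Linear Discriminants trained on random feature subspaces. The feature design (a simple SPAM-like first-order residual co-occurrence), the ensemble parameters, and the synthetic stand-in data are all assumptions for illustration, not the paper's exact enhanced features or ensemble configuration.

```python
# Hypothetical sketch: steganalysis-style features + FLD ensemble for detecting
# adversarial examples. Parameters and feature design are illustrative only.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def residual_cooccurrence_features(img, T=3):
    """SPAM-like features: truncated first-order horizontal residuals,
    summarized by a (2T+1)x(2T+1) co-occurrence histogram."""
    x = img.astype(np.int64)
    d = np.clip(x[:, 1:] - x[:, :-1], -T, T)           # truncated horizontal differences
    pairs = np.stack([d[:, :-1].ravel(), d[:, 1:].ravel()], axis=1)
    hist, _, _ = np.histogram2d(pairs[:, 0], pairs[:, 1],
                                bins=2 * T + 1, range=[[-T, T], [-T, T]])
    return hist.ravel() / max(pairs.shape[0], 1)        # normalized co-occurrences

class FLDEnsemble:
    """Bagging-style ensemble of FLD base learners on random feature subspaces."""
    def __init__(self, n_learners=25, subspace_dim=20, seed=0):
        self.n_learners, self.subspace_dim = n_learners, subspace_dim
        self.rng = np.random.default_rng(seed)
        self.models = []

    def fit(self, X, y):
        for _ in range(self.n_learners):
            idx = self.rng.choice(X.shape[1], self.subspace_dim, replace=False)
            clf = LinearDiscriminantAnalysis().fit(X[:, idx], y)
            self.models.append((idx, clf))
        return self

    def predict(self, X):
        votes = np.mean([clf.predict(X[:, idx]) for idx, clf in self.models], axis=0)
        return (votes >= 0.5).astype(int)                # majority vote: 1 = adversarial

# Usage with synthetic stand-in data (real use: clean images vs. their adversarial versions).
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    clean = rng.integers(0, 256, size=(50, 64, 64))
    advers = np.clip(clean + rng.integers(-2, 3, size=clean.shape), 0, 255)
    X = np.array([residual_cooccurrence_features(im)
                  for im in np.concatenate([clean, advers])])
    y = np.array([0] * 50 + [1] * 50)
    detector = FLDEnsemble().fit(X, y)
    print("train accuracy:", (detector.predict(X) == y).mean())
```

The random-subspace FLD ensemble mirrors the steganalysis ensemble classifier tradition: each weak learner sees only a small slice of the high-dimensional feature vector, and the final decision is a majority vote, which keeps training cheap and avoids relying on a single differentiable model.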
