DeT: Defending Against Adversarial Examples via Decreasing Transferability

Deep neural networks (DNNs) have made great progress in recent years. Unfortunately, DNNs have been shown to be vulnerable to adversarial examples, i.e., inputs injected with elaborately crafted perturbations. In this paper, we propose a defense method named DeT, which can (1) defend against adversarial examples generated by common attacks, and (2) correctly label adversarial examples with both small and large perturbations. DeT is a transferability-based defense method and, to the best of our knowledge, the first such attempt. Our experimental results demonstrate that DeT works well under both black-box and gray-box attacks. We hope that DeT will serve as a benchmark in the research community for measuring the strength of attacks on DNNs.
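
To make the notion of a "transferability-based defense" concrete, the sketch below illustrates one generic way such an idea can be realized: run an input through several independently trained models and flag it as suspicious when their predicted labels disagree, on the assumption that adversarial perturbations transfer poorly across differently trained models. This is only a minimal illustration of the general concept, not the authors' DeT algorithm; the model architecture, seeds, and function names are hypothetical.

```python
# Hypothetical sketch of a transferability-based detection idea (not DeT itself):
# flag inputs on which independently trained models disagree.
import torch
import torch.nn as nn


def make_model(seed: int) -> nn.Module:
    """Build a small classifier; different seeds stand in for independent training runs."""
    torch.manual_seed(seed)
    return nn.Sequential(
        nn.Flatten(),
        nn.Linear(28 * 28, 128),
        nn.ReLU(),
        nn.Linear(128, 10),
    )


def detect_by_disagreement(models, x: torch.Tensor) -> torch.Tensor:
    """Return a boolean mask: True where the models' predicted labels disagree."""
    with torch.no_grad():
        preds = torch.stack([m(x).argmax(dim=1) for m in models])  # (n_models, batch)
    # An input is flagged when any model disagrees with the first one's prediction.
    return (preds != preds[0]).any(dim=0)


if __name__ == "__main__":
    models = [make_model(seed) for seed in (0, 1, 2)]
    x = torch.rand(4, 1, 28, 28)  # stand-in for MNIST-sized inputs
    flagged = detect_by_disagreement(models, x)
    print("flagged as suspicious:", flagged.tolist())
```

In this toy setup, clean inputs would typically receive consistent labels across the ensemble, while perturbations tuned to one model would be more likely to trigger disagreement; a real defense would additionally need a strategy for recovering the correct label of a flagged input.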
