Defensive Collaborative Multi-task Training - Defending Against Adversarial Attacks on Deep Neural Networks

Deep neural networks (DNNs) have shown impressive performance on hard perceptual problems. However, researchers have found that DNN-based systems are vulnerable to adversarial examples, which contain specially crafted, human-imperceptible perturbations. Such perturbations cause DNN-based systems to misclassify the adversarial examples, with potentially disastrous consequences in safety- or security-critical applications. This remains a major security concern, as state-of-the-art attacks can still bypass existing defensive methods. In this paper, we propose a novel defensive framework based on collaborative multi-task training to address this problem. The proposed defence first incorporates specific label pairs into the adversarial training process to enhance model robustness in a black-box setting. A novel collaborative multi-task training framework is then used to construct a detector that identifies adversarial examples based on the pairwise relationship between the label pairs. The detector can identify and reject high-confidence adversarial examples that bypass the black-box defence. The robustness-enhanced model and the detector work reciprocally on false-negative adversarial examples. Importantly, the proposed collaborative architecture prevents the adversary from finding valid adversarial examples even in a nearly-white-box setting.
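To make the idea concrete, below is a minimal PyTorch sketch of one plausible reading of the architecture described above: a shared backbone with a primary classification head and a collaborative auxiliary head trained on paired labels, with FGSM-crafted adversaries folded into training. The pairing function `pair_label`, the `TwoHeadNet` architecture, and the detection rule are illustrative assumptions for a 10-class, single-channel image task, not the paper's exact construction.

```python
# Minimal sketch of the collaborative two-head idea (illustrative, not the
# paper's exact method). Assumes PyTorch and a 10-class grayscale image task.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_CLASSES = 10

def pair_label(y: torch.Tensor) -> torch.Tensor:
    # Hypothetical fixed label pairing: each class is mapped to a partner class.
    return (y + NUM_CLASSES // 2) % NUM_CLASSES

class TwoHeadNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),   # -> [B, 32*4*4]
        )
        self.cls_head = nn.Linear(32 * 16, NUM_CLASSES)  # primary classifier
        self.aux_head = nn.Linear(32 * 16, NUM_CLASSES)  # collaborative head

    def forward(self, x):
        h = self.backbone(x)
        return self.cls_head(h), self.aux_head(h)

def fgsm(model, x, y, eps=0.1):
    # One-step FGSM perturbation used to craft training-time adversaries.
    x = x.clone().detach().requires_grad_(True)
    logits, _ = model(x)
    loss = F.cross_entropy(logits, y)
    grad, = torch.autograd.grad(loss, x)
    return (x + eps * grad.sign()).clamp(0, 1).detach()

def train_step(model, opt, x, y):
    x_adv = fgsm(model, x, y)
    opt.zero_grad()
    logits_c, logits_a = model(torch.cat([x, x_adv]))
    y_all = torch.cat([y, y])
    # The classifier learns the true label on clean and adversarial inputs;
    # the auxiliary head learns the paired label, so the two heads encode a
    # fixed pairwise relationship that a detector can later verify.
    loss = (F.cross_entropy(logits_c, y_all)
            + F.cross_entropy(logits_a, pair_label(y_all)))
    loss.backward()
    opt.step()
    return loss.item()

@torch.no_grad()
def is_adversarial(model, x):
    # Reject inputs whose two heads violate the expected label pairing.
    logits_c, logits_a = model(x)
    return logits_a.argmax(1) != pair_label(logits_c.argmax(1))
```

Under this reading, an input whose two heads break the learned pairing is rejected at test time; fooling such a model in a nearly-white-box setting requires perturbing both heads consistently, which is the intuition behind the collaborative design.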
