Clustering Effect of (Linearized) Adversarial Robust Models

Adversarial robustness has received increasing attention along with the study of adversarial examples. Existing works show that robust models not only achieve robustness against various adversarial attacks but also boost performance on some downstream tasks. However, the underlying mechanism of adversarial robustness remains unclear. In this paper, we interpret adversarial robustness from the perspective of linear components and find that robust models exhibit characteristic statistical properties. Specifically, robust models show a pronounced hierarchical clustering effect on their linearized sub-networks, obtained by removing or replacing all non-linear components (e.g., batch normalization, max pooling, or activation layers). Based on these observations, we propose a novel understanding of adversarial robustness and apply it to further tasks, including domain adaptation and robustness boosting. Experimental evaluations demonstrate the rationality and superiority of our proposed clustering strategy.
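
To make the linearization step concrete, below is a minimal sketch of one plausible way to carry it out; it is not the authors' exact procedure. The toy `SmallCNN` architecture, the helper names `linearize` and `class_linear_maps`, the basis-probing scheme for recovering per-class linear maps, and the cosine-distance average linkage are all illustrative assumptions. ReLU is replaced by identity, max pooling by average pooling, and batch normalization is kept in eval mode (where it is affine), so the resulting sub-network is an affine map whose per-class rows can be clustered hierarchically.

```python
# Minimal sketch, assuming a toy CNN and CIFAR-sized 3x32x32 inputs; not the paper's exact procedure.
import numpy as np
import torch
import torch.nn as nn
from scipy.cluster.hierarchy import linkage, dendrogram


class SmallCNN(nn.Module):
    """Illustrative architecture only (hypothetical, not from the paper)."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.BatchNorm2d(16), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 8 * 8, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))


def linearize(model: SmallCNN) -> SmallCNN:
    """Remove or replace non-linear components: ReLU -> Identity, MaxPool -> AvgPool.
    BatchNorm stays in eval mode, where it acts as an affine transform, so the
    whole network becomes affine in its input."""
    lin = SmallCNN()
    lin.load_state_dict(model.state_dict())
    for i, m in enumerate(lin.features):
        if isinstance(m, nn.ReLU):
            lin.features[i] = nn.Identity()
        elif isinstance(m, nn.MaxPool2d):
            lin.features[i] = nn.AvgPool2d(m.kernel_size)
    return lin.eval()


@torch.no_grad()
def class_linear_maps(lin: SmallCNN, size=(3, 32, 32)) -> np.ndarray:
    """Probe the affine network with basis inputs to recover the effective
    per-class linear maps (rows of the input-to-logit Jacobian)."""
    d = int(np.prod(size))
    bias = lin(torch.zeros(1, *size))            # affine offset
    basis = torch.eye(d).reshape(d, *size)       # one input dimension "on" at a time
    jac = (lin(basis) - bias).T                  # shape: (num_classes, d)
    return jac.numpy()


model = SmallCNN().eval()                        # in practice: a robustly trained model
W = class_linear_maps(linearize(model))
Z = linkage(W, method="average", metric="cosine")  # hierarchical clustering over classes
print(dendrogram(Z, no_plot=True)["ivl"])          # leaf order of the 10 classes
```

Under this sketch, the clustering effect described above would show up as class-wise linear maps of a robust model grouping into semantically coherent branches of the dendrogram, whereas a standardly trained model would not exhibit such structure.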
