We propose a scheme that mitigates the adversarial perturbation $\epsilon$ in an adversarial example $X_{adv} = X \pm \epsilon$, where $X$ is a benign sample, by subtracting the estimated perturbation $\hat{\epsilon}$ from $X + \epsilon$ and adding $\hat{\epsilon}$ to $X - \epsilon$. The estimated perturbation $\hat{\epsilon}$ is the difference between $X_{adv}$ and its moving-averaged version $W_{avg} * X_{adv}$, where $W_{avg}$ is an $N \times N$ moving-average kernel whose coefficients are all one. Because adjacent samples of an image are usually close to each other, we can assume $X \approx W_{avg} * X$; we name this relation X-MAS (X minus Moving-Averaged Samples). Under this assumption, the estimated perturbation $\hat{\epsilon}$ falls within the range of $\epsilon$. The scheme is further extended to multi-level mitigation by treating the mitigated adversarial example $X_{adv} \pm \hat{\epsilon}$ as a new adversarial example to be mitigated. Multi-level mitigation brings $X_{adv}$ closer to $X$, i.e., to a smaller (mitigated) perturbation than the original unmitigated one, by using the moving-averaged adversarial sample $W_{avg} * X_{adv}$ (which carries a smaller perturbation than $X_{adv}$ when $X \approx W_{avg} * X$) as a boundary that the mitigation cannot cross: a sample being decreased cannot go below it, and a sample being increased cannot go beyond it. With multi-level mitigation, we obtain high prediction accuracy even for adversarial examples with a large perturbation (e.g., $\epsilon > 16$). The proposed scheme is evaluated on adversarial examples crafted by FGSM (Fast Gradient Sign Method)-based attacks against ResNet-50 trained on the ImageNet dataset.
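To make the procedure concrete, the following is a minimal sketch of the mitigation as described above, not the authors' reference implementation. It assumes float images of shape (H, W, C) in [0, 255], realizes the $N \times N$ moving average with SciPy's uniform_filter over the spatial axes, keeps $\hat{\epsilon}$ signed so that "subtract from $X + \epsilon$, add to $X - \epsilon$" reduces to a single per-pixel subtraction, and reads the boundary condition as per-pixel clipping at $W_{avg} * X_{adv}$; the function names and the kernel size $N = 3$ are illustrative choices.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def moving_average(x, n=3):
    # W_avg * x: N x N moving average over the spatial axes only.
    return uniform_filter(x, size=(n, n, 1), mode="nearest")

def estimate_perturbation(x, n=3):
    # eps_hat = X_adv - W_avg * X_adv; under X ~= W_avg * X this stays
    # within the range of the true perturbation eps.
    return x - moving_average(x, n)

def xmas_mitigate(x_adv, n=3, levels=1):
    # Multi-level mitigation: each level treats the current (already
    # mitigated) image as a new adversarial example, re-estimates eps_hat,
    # and subtracts it, while W_avg * X_adv serves as the boundary that
    # the mitigation is not allowed to cross.
    x_adv = x_adv.astype(np.float64)
    boundary = moving_average(x_adv, n)
    x = x_adv.copy()
    for _ in range(levels):
        x = x - estimate_perturbation(x, n)
        # Pixels that start above the boundary may not fall below it,
        # and pixels that start below it may not rise above it.
        x = np.where(x_adv >= boundary,
                     np.maximum(x, boundary),
                     np.minimum(x, boundary))
    return np.clip(x, 0.0, 255.0)
```

In the evaluation setting of the abstract, such a routine would be applied to an FGSM-perturbed image before it is fed to the ResNet-50 classifier.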