Probabilistic Margins for Instance Reweighting in Adversarial Training

Reweighting adversarial data during training has recently been shown to improve adversarial robustness, where data closer to the current decision boundaries are regarded as more critical and given larger weights. However, existing methods for measuring this closeness are not very reliable: they are discrete and can take only a few values, and they are path-dependent, i.e., they may change for the same start and end points when different attack paths are taken. In this paper, we propose three types of probabilistic margin (PM), which are continuous and path-independent, for measuring the aforementioned closeness and reweighting adversarial data. Specifically, a PM is defined as the difference between two estimated class-posterior probabilities, e.g., the probability of the true label minus the probability of the most confusing label given some natural data. Although different PMs capture different geometric properties, all three PMs share a negative correlation with the vulnerability of data: data with larger/smaller PMs are safer/riskier and should be given smaller/larger weights. Experiments demonstrated that PMs are reliable and that PM-based reweighting methods outperformed state-of-the-art counterparts.
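
To make the definition concrete, the following is a minimal sketch of one PM variant described above: the estimated class posterior of the true label minus that of the most confusing (runner-up) label, computed from a model's logits, together with a simple decreasing mapping from PMs to instance weights. The function names, the sigmoid weighting form, and the parameter gamma are illustrative assumptions, not the paper's exact reweighting scheme.

```python
import torch
import torch.nn.functional as F

def probabilistic_margin(logits, labels):
    """One PM variant: estimated posterior of the true label minus that of the
    most confusing label. Larger PM means the example is farther from the
    decision boundary; negative PM means it is already misclassified."""
    probs = F.softmax(logits, dim=1)                       # estimated class posteriors
    true_prob = probs.gather(1, labels.unsqueeze(1)).squeeze(1)
    # mask out the true class, then take the largest remaining posterior
    masked = probs.clone()
    masked.scatter_(1, labels.unsqueeze(1), float('-inf'))
    confusing_prob = masked.max(dim=1).values
    return true_prob - confusing_prob                      # continuous, in [-1, 1]

def pm_weights(pm, gamma=2.0):
    """Map PMs to instance weights with a decreasing function so riskier data
    (small or negative PM) get larger weights. The sigmoid form and gamma are
    illustrative choices, not necessarily the weighting used in the paper."""
    w = torch.sigmoid(-gamma * pm)
    return w * (w.numel() / w.sum())                       # normalize to mean weight 1
```

Because a PM depends only on the estimated class posteriors at a single input, it is continuous in the model outputs and does not depend on how an attack reached that input, which is the path-independence emphasized above. In adversarial training, such weights could multiply the per-example losses so that data closer to the decision boundaries contribute more to each update.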
