Learning Sample Reweighting for Accuracy and Adversarial Robustness

Adversarial training has attracted great interest as a defense against adversarial perturbations of neural network classifiers, with much of the effort devoted to balancing the trade-off between robust accuracy and standard accuracy. We propose a novel adversarial training framework that learns to reweight the loss of individual training samples based on a notion of class-conditioned margin, with the goal of improving robust generalization. We formulate weighted adversarial training as a bilevel optimization problem: the upper-level problem corresponds to learning a robust classifier, and the lower-level problem corresponds to learning a parametric function that maps a sample's multi-class margin to an importance weight. Extensive experiments demonstrate that our approach consistently improves both clean and robust accuracy over related methods and state-of-the-art baselines.
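
The bilevel structure described above can be written schematically as follows. This is a hedged sketch, not the paper's exact objectives: here θ denotes the classifier parameters, φ the weighting-function parameters, m_i(θ) the multi-class margin of sample i, ε the perturbation budget, and L_low a placeholder for the lower-level criterion, which the abstract does not specify.

```latex
% Schematic bilevel formulation; the objectives are placeholders,
% not the paper's exact losses.
\begin{align*}
\min_{\theta}\quad &
  \frac{1}{n}\sum_{i=1}^{n}
  w_{\phi^{*}(\theta)}\!\bigl(m_{i}(\theta)\bigr)\,
  \max_{\|\delta_{i}\|_{p}\le\epsilon}
  \ell\bigl(f_{\theta}(x_{i}+\delta_{i}),\,y_{i}\bigr) \\
\text{s.t.}\quad &
  \phi^{*}(\theta)\in\arg\min_{\phi}\,
  \mathcal{L}_{\mathrm{low}}\bigl(\phi;\,\{m_{i}(\theta)\}_{i=1}^{n}\bigr),
\end{align*}
```

A minimal PyTorch sketch of the upper-level step appears below. WeightNet, multi_class_margin, and pgd_attack are hypothetical names chosen for illustration, not the paper's implementation; the weighting network is held fixed (detached) during the classifier update, and the alternating lower-level update of the weighting network is omitted because its exact objective is not given in the abstract.

```python
# Hypothetical sketch of margin-based sample reweighting for adversarial
# training, assuming PyTorch. WeightNet, multi_class_margin, and pgd_attack
# are illustrative names, not the paper's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class WeightNet(nn.Module):
    """Parametric map from a scalar margin to a nonnegative importance weight."""

    def __init__(self, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(1, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Softplus(),  # Softplus keeps weights >= 0
        )

    def forward(self, margin):  # margin: (batch, 1)
        return self.net(margin)


def multi_class_margin(logits, y):
    """Logit of the true class minus the largest competing logit."""
    true_logit = logits.gather(1, y.view(-1, 1)).squeeze(1)
    masked = logits.clone()
    masked.scatter_(1, y.view(-1, 1), float("-inf"))  # exclude the true class
    return true_logit - masked.max(dim=1).values


def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Standard L-infinity PGD to approximate the inner maximization."""
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        (grad,) = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
        delta = delta.detach().requires_grad_(True)
    return (x + delta).clamp(0, 1).detach()  # project back to valid pixel range


def weighted_adv_step(model, weight_net, optimizer, x, y):
    """One upper-level step: weighted adversarial loss with the weights fixed."""
    x_adv = pgd_attack(model, x, y)
    margins = multi_class_margin(model(x), y).detach().unsqueeze(1)
    w = weight_net(margins).squeeze(1).detach()  # freeze phi during the theta step
    w = w / w.sum() * w.numel()                  # normalize weights to mean 1
    per_sample = F.cross_entropy(model(x_adv), y, reduction="none")
    loss = (w * per_sample).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In use, weighted_adv_step would be called once per minibatch, alternating with whatever lower-level update fits the weighting network; learning the margin-to-weight mapping lets training emphasize or de-emphasize samples according to their margin rather than fixing a heuristic schedule.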
