Adversarial Fine-tuning of Compressed Neural Networks for Joint Improvement of Robustness and Efficiency

As deep learning (DL) models become increasingly integrated into everyday life, ensuring their safety by making them robust to adversarial attacks has become critical. DL models are susceptible to adversarial attacks, in which small, targeted perturbations to the input data disrupt model predictions. Adversarial training has been proposed as a mitigation strategy that yields more robust models, but this robustness comes at an additional computational cost, since adversarial examples must be crafted during training. The two objectives, adversarial robustness and computational efficiency, therefore appear to be in conflict with each other. In this work, we explore the effects of two model compression methods, structured weight pruning and quantization, on adversarial robustness. We specifically study the effect of fine-tuning compressed models and present the trade-off between standard fine-tuning and adversarial fine-tuning. Our results show that compression does not inherently lead to a loss in model robustness, and that adversarial fine-tuning of a compressed model can yield large improvements in robustness. We present experiments on two benchmark datasets showing that adversarial fine-tuning of compressed models can achieve robustness comparable to that of adversarially trained models, while also improving computational efficiency.
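To make the described pipeline concrete, below is a minimal sketch (not the authors' exact setup) of adversarial fine-tuning of a structurally pruned network in PyTorch. The model choice (ResNet-18), the 50% pruning ratio, the PGD hyperparameters, and the `train_loader` DataLoader are illustrative assumptions rather than details taken from the paper.

```python
# Minimal sketch: structured pruning followed by PGD adversarial fine-tuning.
# Assumes a pretrained ResNet-18 and a CIFAR-style DataLoader `train_loader`.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune
from torchvision.models import resnet18

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = resnet18(num_classes=10).to(device)  # assume pretrained weights are loaded here

# Structured pruning: zero out 50% of the filters (lowest L1 norm) in every conv layer.
for module in model.modules():
    if isinstance(module, nn.Conv2d):
        prune.ln_structured(module, name="weight", amount=0.5, n=1, dim=0)
        prune.remove(module, "weight")  # make the pruning permanent

def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=7):
    """Projected gradient descent (Madry et al.) under an L-infinity budget."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = nn.functional.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()

# Adversarial fine-tuning: train the compressed model on adversarial examples only.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
model.train()
for x, y in train_loader:  # hypothetical DataLoader over the training set
    x, y = x.to(device), y.to(device)
    x_adv = pgd_attack(model, x, y)
    optimizer.zero_grad()
    loss = nn.functional.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
```

Standard fine-tuning corresponds to the same loop with `x` in place of `x_adv`; the trade-off studied in the paper is between that cheaper loop and the adversarial one above.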
