Understanding the effect of sparsity on neural networks robustness

This paper examines the impact of static sparsity on the robustness of a trained network to weight perturbations, data corruption, and adversarial examples. We show that, up to a certain sparsity achieved by increasing network width and depth while keeping the network capacity fixed, sparsified networks consistently match and often outperform their initially dense versions. Robustness and accuracy decline simultaneously for very high sparsity due to loose connectivity between network layers. Our findings show that a rapid robustness drop caused by network compression observed in the literature is due to a reduced network capacity rather than sparsity. after training, network sparsification does not hurt or even improves robustness to a certain sparsity compared to a dense network of the same capacity. Robustness and accuracy decline simultaneously for very high sparsity due to loose connectivity between network layers. We show that our hypothesis holds when introducing sparsity by increasing network width and depth in separate experiments, applied before and after training. These findings show that a rapid robustness drop caused by network compression observed in the literature is due to a reduced network capacity rather than sparsity.

[1]  John C. Duchi,et al.  Distributionally Robust Losses for Latent Covariate Mixtures , 2020, Oper. Res..

[2]  Vincent Gripon,et al.  Quantization and Deployment of Deep Neural Networks on Microcontrollers , 2021, Sensors.

[3]  Dan Alistarh,et al.  Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks , 2021, J. Mach. Learn. Res..

[4]  Behnam Neyshabur,et al.  Are wider nets better given the same number of parameters? , 2020, ICLR.

[5]  Benjamin F. Grewe,et al.  Neural networks with late-phase weights , 2020, ICLR.

[6]  Matteo Negri,et al.  Entropic gradient descent algorithms and wide flat minima , 2020, ArXiv.

[7]  Hoki Kim Torchattacks : A Pytorch Repository for Adversarial Attacks , 2020, ArXiv.

[8]  Sam Ritchie,et al.  Caliban: Docker-based job manager for reproducible workflows , 2020, J. Open Source Softw..

[9]  Behnam Neyshabur,et al.  Towards Learning Convolutions from Scratch , 2020, NeurIPS.

[10]  Benjamin Recht,et al.  Evaluating Machine Accuracy on ImageNet , 2020, ICML.

[11]  Jose Javier Gonzalez Ortiz,et al.  What is the State of Neural Network Pruning? , 2020, MLSys.

[12]  Matthias Hein,et al.  Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks , 2020, ICML.

[13]  Hossein Mobahi,et al.  Fantastic Generalization Measures and Where to Find Them , 2019, ICLR.

[14]  Aaron C. Courville,et al.  What Do Compressed Deep Neural Networks Forget , 2019, 1911.05248.

[15]  Michael Carbin,et al.  Comparing Rewinding and Fine-tuning in Neural Network Pruning , 2019, ICLR.

[16]  Tatsunori B. Hashimoto,et al.  Distributionally Robust Neural Networks for Group Shifts: On the Importance of Regularization for Worst-Case Generalization , 2019, ArXiv.

[17]  Class-dependent Compression of Deep Neural Networks , 2019, ArXiv.

[18]  Justin Gilmer,et al.  MNIST-C: A Robustness Benchmark for Computer Vision , 2019, ArXiv.

[19]  Erich Elsen,et al.  The State of Sparsity in Deep Neural Networks , 2019, ArXiv.

[20]  Benjamin Recht,et al.  Do ImageNet Classifiers Generalize to ImageNet? , 2019, ICML.

[21]  Thomas G. Dietterich,et al.  Benchmarking Neural Network Robustness to Common Corruptions and Perturbations , 2018, ICLR.

[22]  Michael Carbin,et al.  The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks , 2018, ICLR.

[23]  Fabio Roli,et al.  Wild Patterns: Ten Years After the Rise of Adversarial Machine Learning , 2017, Pattern Recognit..

[24]  Aleksander Madry,et al.  Towards Deep Learning Models Resistant to Adversarial Attacks , 2017, ICLR.

[25]  Frank Hutter,et al.  SGDR: Stochastic Gradient Descent with Warm Restarts , 2016, ICLR.

[26]  Samy Bengio,et al.  Adversarial examples in the physical world , 2016, ICLR.

[27]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[28]  Jonathon Shlens,et al.  Explaining and Harnessing Adversarial Examples , 2014, ICLR.

[29]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[30]  Fei-Fei Li,et al.  ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.