Adversarial Neural Pruning

Neural networks are known to be susceptible to adversarial perturbations; they are also computationally and memory intensive, which makes them difficult to deploy in real-world applications where both security and computation are constrained. In this work, we aim to obtain networks that are both robust and sparse, and thus applicable to such scenarios, based on the intuition that latent features vary in their susceptibility to adversarial perturbations. Specifically, we define vulnerability in the latent feature space and then propose a Bayesian framework that prioritizes features according to their contribution to both the original and the adversarial loss, pruning vulnerable features while preserving robust ones. Through quantitative evaluation and qualitative analysis of perturbations to latent features, we show that our sparsification method acts as a defense mechanism against adversarial attacks and that the robustness indeed comes from the model's ability to prune vulnerable latent features, i.e., those more susceptible to adversarial perturbations.
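
To make the intuition concrete, below is a minimal PyTorch sketch, not the paper's actual implementation: it measures the vulnerability of latent features as the expected absolute distortion between clean and adversarial activations at a chosen layer, then builds a hard pruning mask that keeps the least vulnerable features. The function names (feature_vulnerability, vulnerability_prune_mask) and the hard-threshold rule are illustrative assumptions; the paper instead learns the pruning within a Bayesian framework driven by both the original and adversarial losses.

    import torch

    def feature_vulnerability(model, layer, x_clean, x_adv):
        # Vulnerability of each latent feature: expected absolute distortion
        # between clean and adversarial activations at the given layer.
        feats = {}

        def hook(module, inputs, output):
            feats["z"] = output.detach()

        handle = layer.register_forward_hook(hook)
        with torch.no_grad():
            model(x_clean)
            z_clean = feats["z"]
            model(x_adv)      # x_adv: adversarial examples, e.g. from PGD
            z_adv = feats["z"]
        handle.remove()

        # Average |z_adv - z_clean| over the batch (and spatial dims for
        # convolutional features), giving one vulnerability score per channel.
        distortion = (z_adv - z_clean).abs()
        reduce_dims = [0] + list(range(2, distortion.dim()))
        return distortion.mean(dim=reduce_dims)

    def vulnerability_prune_mask(vulnerability, keep_ratio=0.7):
        # Keep the least vulnerable features and zero out the rest.
        k = max(1, int(keep_ratio * vulnerability.numel()))
        threshold = vulnerability.sort().values[k - 1]
        return (vulnerability <= threshold).float()

The resulting mask could be applied multiplicatively to the layer's activations; in the method described above, such masks are instead learned jointly with adversarial training under a Bayesian prior, so the fixed keep_ratio here serves only to convey the idea.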
