Adversarially Trained Model Compression: When Robustness Meets Efficiency

Deep model compression has been extensively studied, and state-of-the-art methods can now achieve high compression ratios with minimal accuracy loss. This paper studies model compression through a different lens: can we compress models without hurting their robustness to adversarial attacks, in addition to maintaining accuracy? Previous literature suggests that the goals of robustness and compactness may sometimes conflict. We propose a novel Adversarially Trained Model Compression (ATMC) framework. ATMC constructs a unified constrained optimization formulation in which existing compression means (pruning, factorization, quantization) are all integrated into the constraints, and an efficient algorithm is developed to solve it. An extensive set of experiments demonstrates that ATMC obtains a remarkably more favorable trade-off among model size, accuracy, and robustness than currently available alternatives across various settings. The code is publicly available at: https://github.com/shupenggui/ATMC.
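To make this concrete, below is a schematic sketch, in illustrative notation rather than the paper's exact formulation, of the kind of constrained min-max problem the abstract describes: an adversarial training objective minimized over the model parameters, with compression expressed as feasibility constraints. The symbols $\theta$, $\delta$, $\epsilon$ and the constraint sets are assumptions introduced here for exposition.

\begin{equation*}
\min_{\theta}\;
\mathbb{E}_{(x,y)\sim\mathcal{D}}
\Big[\max_{\|\delta\|_{\infty}\le\epsilon}
\mathcal{L}\big(f_{\theta}(x+\delta),\, y\big)\Big]
\quad \text{s.t.} \quad
\theta \in \mathcal{C}_{\mathrm{prune}} \cap \mathcal{C}_{\mathrm{factor}} \cap \mathcal{C}_{\mathrm{quant}}
\end{equation*}

Here, for instance, $\mathcal{C}_{\mathrm{prune}} = \{\theta : \|\theta\|_{0} \le k\}$ would bound the number of nonzero weights, $\mathcal{C}_{\mathrm{factor}}$ would restrict each layer's weight matrix to a low-rank (or low-rank-plus-sparse) structure, and $\mathcal{C}_{\mathrm{quant}}$ would limit weights to a small codebook of low-bit values. Such problems are typically tackled by alternating inner-maximization (attack) steps with projected updates of $\theta$ onto the constraint sets.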
