RED++: Data-Free Pruning of Deep Neural Networks via Input Splitting and Output Merging

Pruning Deep Neural Networks (DNNs) is a prominent field of study aimed at accelerating inference runtime. In this paper, we introduce RED++, a novel data-free pruning protocol. Requiring only a trained neural network and agnostic to the DNN architecture, RED++ exploits an adaptive, data-free scalar hashing that exposes redundancies among neuron weight values. We study the theoretical and empirical guarantees on accuracy preservation under this hashing, as well as the pruning ratio expected from exploiting the resulting redundancies. We then propose a novel data-free pruning technique for DNN layers that removes input-wise redundant operations. The algorithm is straightforward, parallelizable, and offers a novel perspective on DNN pruning by shifting the burden of heavy computation to efficient memory access and allocation. We provide theoretical guarantees on RED++ performance and empirically demonstrate its superiority over other data-free pruning methods, as well as its competitiveness with data-driven ones, on ResNets, MobileNets and EfficientNets.
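
To make the hashing-and-merging idea concrete, the sketch below illustrates the general principle on a toy fully-connected layer. It is only an illustrative approximation under stated assumptions, not the authors' implementation: the function names (`hash_weights`, `merge_identical_neurons`), the plain 1-D k-means used as a stand-in for the paper's adaptive scalar hashing, and the toy layer dimensions are all choices made here for the example. The paper's actual input-splitting step and the compensation of downstream weights after merging are not reproduced.

```python
# Minimal sketch (not the RED++ implementation):
# (1) hash each scalar weight to a small set of representative values, then
# (2) group output neurons whose hashed weight vectors become identical.
# The 1-D k-means below is a simple stand-in for the adaptive hashing
# described in the paper; layer sizes are kept tiny so collisions appear.
import numpy as np


def hash_weights(w, n_bins=3, n_iter=20):
    """Quantize the scalars of a weight matrix to n_bins representative values."""
    flat = w.ravel()
    # Initialize centroids on evenly spaced quantiles of the weight distribution.
    centroids = np.quantile(flat, np.linspace(0.0, 1.0, n_bins))
    for _ in range(n_iter):
        assign = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
        for k in range(n_bins):
            if np.any(assign == k):
                centroids[k] = flat[assign == k].mean()
    assign = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
    return centroids[assign].reshape(w.shape)


def merge_identical_neurons(w_hashed):
    """Keep one representative per group of identical hashed rows (output neurons)."""
    unique_rows, mapping = np.unique(w_hashed, axis=0, return_inverse=True)
    # 'mapping' sends each original neuron to its surviving representative;
    # in the actual method, the next layer's weights would be adjusted so that
    # the merged network computes the same function.
    return unique_rows, mapping


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=(32, 3)).astype(np.float32)  # toy dense-layer weights
    w_h = hash_weights(w)
    merged, mapping = merge_identical_neurons(w_h)
    print(f"{w.shape[0]} neurons -> {merged.shape[0]} after hashing + merging")
```

On realistic layers the hashing step, rather than the tiny dimensions used here, is what creates the exploitable redundancy; this toy example only shows the mechanics of detecting and merging duplicate hashed neurons.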
