EasiEdge: A Novel Global Deep Neural Networks Pruning Method for Efficient Edge Computing

Deep neural networks (DNNs) have shown tremendous success in many areas, such as signal processing, computer vision, and artificial intelligence. However, DNNs require intensive computational resources, which hinders their practical deployment on edge devices with limited storage and compute capacity. Filter pruning has been recognized as a useful technique for compressing and accelerating DNNs, but most existing works prune filters in a layerwise manner, which has two significant drawbacks. First, layerwise pruning methods require prohibitive computation for per-layer sensitivity analysis. Second, layerwise pruning suffers from the accumulation of pruning errors, degrading the performance of the pruned network. To address these challenges, we propose a novel global pruning method, EasiEdge, to compress and accelerate DNNs for efficient edge computing. More specifically, we introduce the alternating direction method of multipliers (ADMM) to decompose the pruning problem into a performance-improving subproblem and a global pruning subproblem. In the global pruning subproblem, we propose to use information gain (IG) to quantify the impact of filter removal on the class probability distribution of the network output. In addition, we propose a Taylor-based approximate algorithm (TBAA) to efficiently calculate the IG of filters. Extensive experiments on three data sets and two edge computing platforms verify that EasiEdge can efficiently accelerate DNNs on edge platforms with nearly negligible accuracy loss. For example, when EasiEdge prunes 80% of the filters in VGG-16, accuracy drops by only 0.22%, while inference latency on the CPU of a Jetson TX2 decreases from 76.85 to 8.01 ms.
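The abstract does not spell out the exact IG formula or the TBAA update, so the snippet below is only a minimal illustrative sketch in PyTorch (the helper `filter_importance` and its arguments are our own naming, not the paper's). It scores each filter of a convolutional layer by how much zeroing that filter shifts the network's output class distribution on a calibration batch, measured as a KL divergence; EasiEdge itself avoids this exhaustive per-filter evaluation by approximating the score with its Taylor-based approximate algorithm.

```python
# Hypothetical sketch (not the authors' code): score each filter by the shift in
# the output class distribution when that filter is zeroed, in the spirit of the
# information-gain criterion described in the abstract.
import torch
import torch.nn.functional as F

def filter_importance(model, conv, x):
    """Score each filter of `conv` by KL(p_ref || p_without_filter_k),
    averaged over a calibration batch `x` (bias terms left untouched)."""
    model.eval()
    with torch.no_grad():
        p_ref = F.softmax(model(x), dim=1)        # reference class distribution
        scores = []
        for k in range(conv.out_channels):
            saved = conv.weight[k].clone()
            conv.weight[k].zero_()                # temporarily remove filter k
            log_q = F.log_softmax(model(x), dim=1)
            # Larger KL => removing filter k perturbs the output more.
            scores.append(F.kl_div(log_q, p_ref, reduction="batchmean").item())
            conv.weight[k].copy_(saved)           # restore filter k
    return scores
```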
