Auto Graph Encoder-Decoder for Model Compression and Network Acceleration

Model compression aims to deploy deep neural networks (DNNs) on mobile devices with limited computing power and storage resources. However, most existing model compression methods rely on manually defined rules and therefore demand domain expertise. In this paper, we propose Auto Graph encoder-decoder Model Compression (AGMC), which combines graph neural networks (GNNs) and reinforcement learning (RL) to find the best compression policy. We model the target DNN as a graph and use a GNN to learn its embeddings automatically. In our experiments, we first compared our method with rule-based DNN embedding methods to demonstrate the effectiveness of the graph auto encoder-decoder: our learning-based DNN embedding achieved better performance and a higher compression ratio with fewer search steps. We then evaluated AGMC on the CIFAR-10 and ILSVRC-2012 datasets against both handcrafted and learning-based model compression approaches. On ResNet-56, our method outperformed handcrafted and learning-based methods by 3.6% and 1.8% in accuracy, respectively. On MobileNet-V2, we achieved a higher compression ratio than state-of-the-art methods with only a 0.93% accuracy loss.
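To make the core idea concrete, below is a minimal sketch (not the authors' code) of an AGMC-style pipeline: the target DNN's layers become graph nodes, a simple GCN encoder produces per-layer embeddings, and a decoder head maps each embedding to a pruning ratio in (0, 1), which an RL agent such as DDPG could then refine. The chain-graph construction, the per-layer feature vector, and the decoder head are illustrative assumptions.

```python
import torch
import torch.nn as nn


def layer_graph(num_layers: int) -> torch.Tensor:
    """Chain graph over layers (layer i feeds layer i+1) with self-loops,
    returned as a symmetrically normalized adjacency matrix (standard GCN form).
    This sequential topology is an assumption; real networks with skip
    connections would add extra edges."""
    A = torch.eye(num_layers)
    idx = torch.arange(num_layers - 1)
    A[idx, idx + 1] = 1.0
    A[idx + 1, idx] = 1.0
    d_inv_sqrt = torch.diag(A.sum(dim=1).pow(-0.5))
    return d_inv_sqrt @ A @ d_inv_sqrt


class GCNLayer(nn.Module):
    """One graph-convolution step: aggregate neighbor features, then transform."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, x: torch.Tensor, a_hat: torch.Tensor) -> torch.Tensor:
        return torch.relu(self.lin(a_hat @ x))


class GraphEncoderDecoder(nn.Module):
    """Encode per-layer features into embeddings, then decode one pruning
    ratio per layer (the continuous action space in AGMC-style methods)."""

    def __init__(self, feat_dim: int, hidden_dim: int = 64):
        super().__init__()
        self.enc1 = GCNLayer(feat_dim, hidden_dim)
        self.enc2 = GCNLayer(hidden_dim, hidden_dim)
        self.decoder = nn.Sequential(nn.Linear(hidden_dim, 1), nn.Sigmoid())

    def forward(self, x: torch.Tensor, a_hat: torch.Tensor) -> torch.Tensor:
        h = self.enc2(self.enc1(x, a_hat), a_hat)
        return self.decoder(h).squeeze(-1)


# Toy usage: 5 layers, each described by hypothetical static features
# (e.g. normalized FLOPs, parameter count, kernel size).
feats = torch.rand(5, 3)
a_hat = layer_graph(5)
ratios = GraphEncoderDecoder(feat_dim=3)(feats, a_hat)
print(ratios)  # one pruning ratio per layer
```

In a full system, these decoded ratios would serve as the RL agent's actions: the pruned network's accuracy and compression ratio form the reward, and the policy gradient updates flow back through the encoder-decoder, which is what lets the DNN embedding be learned rather than hand-designed.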
