AdaDeep: A Usage-Driven, Automated Deep Model Compression Framework for Enabling Ubiquitous Intelligent Mobiles

Recent breakthroughs in Deep Neural Networks (DNNs) have fueled a rapidly growing demand for bringing DNN-powered intelligence to mobile platforms. While DNN compression techniques have demonstrated the potential of deploying DNNs on resource-constrained platforms, current practice suffers from two limitations: 1) only stand-alone compression schemes are investigated, even though each compression technique suits only certain types of DNN layers; and 2) compression techniques are mostly optimized for DNNs' inference accuracy, without explicitly considering other application-driven system performance metrics (e.g., latency and energy cost) or the varying resource availability across platforms (e.g., storage and processing capability). To this end, we propose AdaDeep, a usage-driven, automated DNN compression framework that systematically explores the desired trade-off between performance and resource constraints at a holistic system level. Specifically, in a layer-wise manner, AdaDeep automatically selects the most suitable combination of compression techniques and the corresponding compression hyperparameters for a given DNN. Thorough evaluations on six datasets and across twelve devices demonstrate that AdaDeep can achieve up to $18.6\times$ latency reduction, $9.8\times$ energy-efficiency improvement, and $37.3\times$ storage reduction in DNNs while incurring negligible accuracy loss. Furthermore, AdaDeep uncovers multiple novel combinations of compression techniques.
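To make the layer-wise selection idea concrete, below is a minimal greedy sketch in Python: each layer is matched only against techniques suitable for its type, and the accuracy budget is spent where it buys the most resource savings. The technique names, cost factors, and budgets are illustrative assumptions for this sketch only; AdaDeep's actual optimizer and its technique/hyperparameter space are not reproduced here.

```python
from dataclasses import dataclass

@dataclass
class Layer:
    name: str
    kind: str           # "conv" or "fc"
    storage_mb: float   # parameter storage of the uncompressed layer
    latency_ms: float   # measured inference latency of the layer

# Illustrative per-layer effects of each technique: multiplicative factors
# on storage/latency plus an additive accuracy drop. These names and numbers
# are assumptions for the sketch, not AdaDeep's actual cost model.
EFFECTS = {
    "prune":     {"storage": 0.30, "latency": 0.60, "acc_drop": 0.010},
    "factorize": {"storage": 0.40, "latency": 0.70, "acc_drop": 0.008},
    "decompose": {"storage": 0.50, "latency": 0.45, "acc_drop": 0.015},
    "none":      {"storage": 1.00, "latency": 1.00, "acc_drop": 0.000},
}

# Each technique only suits certain layer types (the paper's first point).
SUITABLE = {"conv": ("decompose", "prune", "none"),
            "fc":   ("factorize", "prune", "none")}

def select_per_layer(layers, storage_budget, latency_budget, acc_budget):
    """Greedily pick one technique per layer, spending the accuracy
    budget where it yields the largest normalized resource savings."""
    plan, acc_spent = {}, 0.0
    storage = sum(l.storage_mb for l in layers)
    latency = sum(l.latency_ms for l in layers)
    for layer in layers:
        best, best_gain = "none", 0.0
        for tech in SUITABLE[layer.kind]:
            fx = EFFECTS[tech]
            if acc_spent + fx["acc_drop"] > acc_budget:
                continue  # would exceed the tolerated accuracy loss
            # Savings toward both budgets, normalized so they are comparable.
            gain = (layer.storage_mb * (1 - fx["storage"]) / storage_budget
                    + layer.latency_ms * (1 - fx["latency"]) / latency_budget)
            if gain > best_gain:
                best, best_gain = tech, gain
        fx = EFFECTS[best]
        plan[layer.name] = best
        acc_spent += fx["acc_drop"]
        storage -= layer.storage_mb * (1 - fx["storage"])
        latency -= layer.latency_ms * (1 - fx["latency"])
    feasible = storage <= storage_budget and latency <= latency_budget
    return plan, feasible

if __name__ == "__main__":
    net = [Layer("conv1", "conv", 0.1, 4.0),
           Layer("conv2", "conv", 1.2, 9.0),
           Layer("fc1",   "fc",  12.0, 3.0),
           Layer("fc2",   "fc",   0.4, 0.5)]
    plan, ok = select_per_layer(net, storage_budget=5.0,
                                latency_budget=10.0, acc_budget=0.05)
    print(plan, "feasible:", ok)
```

On this toy network the sketch assigns convolution decomposition to the conv layers and pruning to the fully-connected layers, illustrating why per-layer selection can outperform applying any single compression scheme uniformly.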
