Fully Dynamic Inference With Deep Neural Networks

Modern deep neural networks are powerful and widely applicable models that extract task-relevant information through multi-level abstraction. Their cross-domain success, however, often comes at the expense of high computational cost, memory bandwidth, and inference latency, which prevents their deployment in resource-constrained and time-sensitive scenarios, such as edge inference and self-driving cars. While recently developed methods for creating efficient deep neural networks make real-world deployment more feasible by reducing model size, they do not fully exploit input properties on a per-instance basis to maximize computational efficiency and task accuracy. In particular, most existing methods use a one-size-fits-all approach that processes all inputs identically. Motivated by the fact that different images require different feature embeddings to be classified accurately, we propose a fully dynamic paradigm that imparts deep convolutional neural networks with hierarchical inference dynamics at the level of layers and individual convolutional filters/channels. Two compact networks, called Layer-Net (L-Net) and Channel-Net (C-Net), predict on a per-instance basis which layers or filters/channels are redundant and should therefore be skipped. L-Net and C-Net also learn how to scale the retained computation outputs to maximize task accuracy. By integrating L-Net and C-Net into a joint design framework, called LC-Net, we consistently outperform state-of-the-art dynamic frameworks in both efficiency and classification accuracy. On the CIFAR-10 dataset, LC-Net achieves up to 11.9$\times$ fewer floating-point operations (FLOPs) and up to 3.3% higher accuracy than other dynamic inference methods. On the ImageNet dataset, LC-Net achieves up to 1.4$\times$ fewer FLOPs and up to 4.6% higher Top-1 accuracy than the other methods.
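
To make the per-instance gating idea concrete, the following is a minimal PyTorch sketch of a C-Net-style channel gate: a small side network summarizes the input, predicts hard keep/skip decisions for each output channel of a convolution, and learns a scale factor for the retained channels. The class names, the straight-through gate estimator, and the pooling choice are illustrative assumptions, not the paper's actual L-Net/C-Net implementation.

```python
# Hypothetical sketch of per-instance channel gating in the spirit of C-Net.
# Not the authors' implementation; names and details are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ChannelGate(nn.Module):
    """Predicts per-channel keep/skip decisions and scale factors from the input."""

    def __init__(self, in_channels: int, out_channels: int, hidden: int = 16):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(in_channels, hidden),
            nn.ReLU(inplace=True),
        )
        self.keep_logits = nn.Linear(hidden, out_channels)  # skip-or-keep decision
        self.scales = nn.Linear(hidden, out_channels)       # scaling of kept channels

    def forward(self, x: torch.Tensor):
        # Summarize the input with global average pooling: (N, C, H, W) -> (N, C).
        summary = F.adaptive_avg_pool2d(x, 1).flatten(1)
        h = self.body(summary)
        probs = torch.sigmoid(self.keep_logits(h))
        # Straight-through estimator: hard 0/1 gates in the forward pass,
        # sigmoid gradients in the backward pass.
        hard = (probs > 0.5).float()
        gates = hard + probs - probs.detach()
        scales = F.softplus(self.scales(h))  # non-negative per-channel scaling
        return gates, scales


class GatedConvBlock(nn.Module):
    """A conv block whose output channels are skipped or rescaled per instance."""

    def __init__(self, in_channels: int, out_channels: int):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, 3, padding=1, bias=False)
        self.bn = nn.BatchNorm2d(out_channels)
        self.gate = ChannelGate(in_channels, out_channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        gates, scales = self.gate(x)                       # each of shape (N, C_out)
        y = F.relu(self.bn(self.conv(x)))
        # Zero out skipped channels and rescale the retained ones.
        return y * (gates * scales).unsqueeze(-1).unsqueeze(-1)


if __name__ == "__main__":
    block = GatedConvBlock(3, 32)
    out = block(torch.randn(2, 3, 32, 32))
    print(out.shape)  # torch.Size([2, 32, 32, 32])
```

An L-Net-style layer gate would follow the same pattern one level up: a scalar gate per layer or residual block that, when zero, lets the block's input pass through unchanged so the block's computation can be skipped entirely for that input.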
