Differentiable Branching in Deep Networks for Fast Inference

In this paper, we consider the design of deep neural networks augmented with multiple auxiliary classifiers branching off from the main (backbone) network. These classifiers allow early exit from the network at intermediate layers, making them well suited to energy-constrained applications such as IoT, embedded devices, or Fog computing. However, designing an optimized early-exit strategy is a difficult task that generally requires a large amount of manual fine-tuning. We propose a way to jointly optimize this strategy together with the branches, providing an end-to-end trainable algorithm for this emerging class of neural networks. We achieve this by replacing the original output of the branches with a ‘soft’, differentiable approximation. In addition, we propose a regularization approach to trade off the computational efficiency of the early-exit strategy against the overall classification accuracy. We evaluate the proposed design approach on a set of image classification benchmarks, showing significant gains in accuracy and inference time.
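To make the idea concrete, the sketch below shows one possible way such a soft, differentiable early-exit mechanism could be wired up in PyTorch: each branch produces class logits and a sigmoid "exit gate", the network's output is a gate-weighted mixture of all branch predictions, and an extra penalty on the gates trades computation against accuracy. All names here (BranchyBlock, gate, the trade-off weight lam) are illustrative assumptions and do not reproduce the paper's exact architecture or loss.

```python
# Minimal, hypothetical sketch of "soft" early-exit branching (assumed names, not the paper's code).
import torch
import torch.nn as nn


class BranchyBlock(nn.Module):
    """One backbone stage with an auxiliary classifier and a learned exit gate."""

    def __init__(self, in_ch, out_ch, num_classes):
        super().__init__()
        self.stage = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),
        )
        self.classifier = nn.Linear(out_ch * 16, num_classes)  # auxiliary classifier
        self.gate = nn.Linear(out_ch * 16, 1)                  # confidence of exiting here

    def forward(self, x):
        h = self.stage(x)
        flat = h.flatten(1)
        logits = self.classifier(flat)
        g = torch.sigmoid(self.gate(flat))  # soft, differentiable exit decision in (0, 1)
        return h, logits, g


class DifferentiableBranchingNet(nn.Module):
    """Mixes branch predictions with soft gates so the exit strategy trains end-to-end."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.blocks = nn.ModuleList([
            BranchyBlock(3, 16, num_classes),
            BranchyBlock(16, 32, num_classes),
        ])
        self.final = nn.Linear(32 * 16, num_classes)  # last (backbone) classifier

    def forward(self, x):
        remaining = x.new_ones(x.size(0), 1)  # probability mass not yet "exited"
        combined = 0.0
        gates = []
        h = x
        for block in self.blocks:
            h, logits, g = block(h)
            combined = combined + remaining * g * logits
            remaining = remaining * (1.0 - g)
            gates.append(g)
        combined = combined + remaining * self.final(h.flatten(1))
        return combined, gates


# Usage sketch: the gate penalty (weight lam is a hypothetical hyperparameter)
# pushes probability mass toward early exits, trading accuracy for computation.
model = DifferentiableBranchingNet(num_classes=10)
x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))
out, gates = model(x)
lam = 0.1
loss = nn.functional.cross_entropy(out, y) + lam * sum((1.0 - g).mean() for g in gates)
loss.backward()
```

At inference time, under this sketch, each gate can simply be thresholded so that a sample exits at the first branch whose gate is confident enough, skipping the remaining layers entirely.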
