BottleNet++: An End-to-End Approach for Feature Compression in Device-Edge Co-Inference Systems

The emergence of various intelligent mobile applications demands the deployment of powerful deep learning models on resource-constrained mobile devices. The device-edge co-inference framework provides a promising solution by splitting a neural network between a mobile device and an edge computing server. To balance the on-device computation and the communication overhead, the splitting point must be carefully chosen, and the intermediate feature must be compressed before transmission. Existing studies decouple the design of model splitting, feature compression, and communication, which may lead to excessive resource consumption at the mobile device. In this paper, we introduce an end-to-end architecture, named BottleNet++, that consists of an encoder, a non-trainable channel layer, and a decoder for more efficient feature compression and transmission. The encoder and decoder essentially implement joint source-channel coding via lightweight convolutional neural networks (CNNs), while explicitly accounting for the effect of channel noise. By exploiting the strong sparsity and the fault-tolerant property of the intermediate feature in deep neural networks (DNNs), BottleNet++ achieves a much higher compression ratio than existing methods. Compared with transmitting the intermediate data without feature compression, BottleNet++ achieves up to a 64× bandwidth reduction over the additive white Gaussian noise (AWGN) channel and up to a 256× bit compression ratio over the binary erasure channel, with less than 2% loss in classification accuracy.
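The encoder–channel–decoder pipeline described above can be illustrated with a minimal numpy sketch. For brevity, the encoder and decoder are stand-in random linear maps rather than the paper's trained lightweight CNNs, and the dimensions (a 256-element feature compressed to 16 values) are hypothetical; only the overall structure (on-device compression, a non-trainable AWGN channel layer, server-side reconstruction) mirrors the architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: a flattened intermediate feature of size 256,
# compressed to a 16-element code (a 16x reduction in this toy example).
FEATURE_DIM, CODE_DIM = 256, 16

# Encoder/decoder stand-ins: random linear maps in place of the paper's
# lightweight CNNs (BottleNet++ trains these end to end).
W_enc = rng.standard_normal((CODE_DIM, FEATURE_DIM)) / np.sqrt(FEATURE_DIM)
W_dec = rng.standard_normal((FEATURE_DIM, CODE_DIM)) / np.sqrt(CODE_DIM)

def awgn_channel(z, snr_db=10.0):
    """Non-trainable channel layer: add white Gaussian noise at a given SNR."""
    signal_power = np.mean(z ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    return z + rng.standard_normal(z.shape) * np.sqrt(noise_power)

feature = rng.standard_normal(FEATURE_DIM)   # intermediate DNN feature
code = W_enc @ feature                       # on-device encoder (compression)
received = awgn_channel(code)                # transmission over the AWGN channel
reconstructed = W_dec @ received             # server-side decoder

print(code.shape, reconstructed.shape)       # (16,) (256,)
```

Because the channel layer is inserted between encoder and decoder during training, the learned codes become robust to the noise the channel actually introduces, which is what lets the scheme trade a small accuracy loss for a large bandwidth reduction.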
