BottleFit: Learning Compressed Representations in Deep Neural Networks for Effective and Efficient Split Computing

Although mission-critical applications require the use of deep neural networks (DNNs), their continuous execution on mobile devices results in a significant increase in energy consumption. While edge offloading can decrease energy consumption, erratic patterns in channel quality, network load, and edge server load can severely disrupt the system’s key operations. An alternative approach, called split computing, generates compressed representations within the model (called "bottlenecks") to reduce bandwidth usage and energy consumption. Prior work has proposed approaches that introduce additional layers, to the detriment of energy consumption and latency. For this reason, we propose a new framework called BottleFit, which, in addition to targeted DNN architecture modifications, includes a novel training strategy to achieve high accuracy even with strong compression rates. We apply BottleFit to cutting-edge DNN models for image classification, and show that BottleFit achieves 77.1% data compression with up to 0.6% accuracy loss on the ImageNet dataset, whereas state-of-the-art approaches such as SPINN lose up to 6% in accuracy. We experimentally measure the power consumption and latency of an image classification application running on an NVIDIA Jetson Nano board (GPU-based) and a Raspberry Pi board (GPU-less). We show that BottleFit decreases power consumption and latency by up to 49% and 89%, respectively, with respect to (w.r.t.) local computing, and by 37% and 55% w.r.t. edge offloading. We also compare BottleFit with state-of-the-art autoencoder-based approaches, and show that (i) BottleFit reduces power consumption and execution time by up to 54% and 44%, respectively, on the Jetson and by 40% and 62% on the Raspberry Pi; and (ii) the size of the head model executed on the mobile device is 83 times smaller. We publish the code repository for reproducibility of the results in this study.
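The split-computing idea described above can be illustrated with a minimal sketch. It is not BottleFit's architecture or training strategy: the `head`, `tail`, and `compression_percent` functions are hypothetical names introduced here, average pooling with 8-bit quantization stands in for a learned bottleneck, and the input is assumed to be stored as 32-bit floats.

```python
from typing import List, Tuple


def head(x: List[float], factor: int) -> Tuple[bytes, float, float]:
    """Toy 'head' run on the mobile device (illustrative assumption).

    Average-pools the input by `factor`, then quantizes each pooled value
    to one byte. Returns the payload plus the offset and scale needed to
    dequantize it on the edge server.
    """
    pooled = [
        sum(x[i:i + factor]) / len(x[i:i + factor])
        for i in range(0, len(x), factor)
    ]
    lo, hi = min(pooled), max(pooled)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 for constant input
    payload = bytes(min(255, round((v - lo) / scale)) for v in pooled)
    return payload, lo, scale


def tail(payload: bytes, lo: float, scale: float) -> List[float]:
    """Toy 'tail' run on the edge server: dequantize the received bytes."""
    return [lo + b * scale for b in payload]


def compression_percent(original_bytes: int, sent_bytes: int) -> float:
    """Share of the original data volume that is NOT transmitted."""
    return 100.0 * (1.0 - sent_bytes / original_bytes)


# 64 input values, assumed to occupy 4 bytes each (float32) if sent raw.
x = [float(i) for i in range(64)]
payload, lo, scale = head(x, factor=4)
restored = tail(payload, lo, scale)
saving = compression_percent(original_bytes=len(x) * 4, sent_bytes=len(payload))
print(f"sent {len(payload)} bytes, compression {saving:.2f}%")
```

In this toy setting only the 16-byte payload crosses the wireless link instead of the 256-byte input. In an actual split DNN the bottleneck would be learned jointly with the rest of the model, so that the tail can still produce accurate predictions from the compressed representation.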
