Complexity-aware Adaptive Training and Inference for Edge-Cloud Distributed AI Systems

The ubiquitous use of IoT and machine learning applications is creating large amounts of data that require accurate and real-time processing. Although edge-based smart data processing can be enabled by deploying pretrained models, the energy and memory constraints of edge devices necessitate distributed deep learning between the edge and the cloud for complex data. In this paper, we propose a distributed system that exploits both the edge and the cloud for training and inference. We introduce a new architecture, MEANet, consisting of a main block, an extension block, and an adaptive block deployed at the edge. Inference can terminate at the main block, at the extension block, or in the cloud. MEANet is trained to categorize inputs as easy, hard, or complex. The main block identifies easy and hard instances and classifies easy ones with high confidence; only inputs with a high probability of belonging to the hard category are forwarded to the extension block for prediction. If the edge network remains unconfident in its prediction, the instance is deemed complex and sent to the cloud for further processing. With this training scheme, the majority of inferences complete on the edge device, and only a small set of complex inputs is offloaded to the cloud. The performance of the proposed system is evaluated through extensive experiments with modified ResNet and MobileNetV2 models on the CIFAR-100 and ImageNet datasets. The results show that the proposed distributed model achieves higher accuracy and lower energy consumption than the standard models, indicating its capacity to adapt inference effort to input complexity.
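The routing logic described above can be illustrated with a minimal sketch. The snippet below is not the authors' implementation; it assumes illustrative confidence thresholds (`t_main`, `t_ext`) and placeholder linear blocks in place of the modified ResNet/MobileNetV2 stages, and only shows how a confidence-gated exit could decide between the main block, the extension block, and cloud offload.

```python
# Minimal sketch of confidence-gated early-exit routing (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MEANetSketch(nn.Module):
    def __init__(self, num_classes=100, feat_dim=64, t_main=0.9, t_ext=0.8):
        super().__init__()
        # Placeholder blocks; the paper uses modified ResNet / MobileNetV2 stages.
        self.main_block = nn.Sequential(nn.Flatten(), nn.Linear(feat_dim, num_classes))
        self.extension_block = nn.Sequential(nn.Flatten(), nn.Linear(feat_dim, num_classes))
        self.t_main = t_main  # confidence needed to exit at the main block (assumed value)
        self.t_ext = t_ext    # confidence needed to exit at the extension block (assumed value)

    def route(self, x):
        """Return (prediction, exit_point); exit_point is 'main', 'extension', or 'cloud'."""
        logits_main = self.main_block(x)
        conf_main, pred_main = F.softmax(logits_main, dim=1).max(dim=1)
        if conf_main.item() >= self.t_main:   # "easy" input: stop at the edge main block
            return pred_main, "main"

        logits_ext = self.extension_block(x)
        conf_ext, pred_ext = F.softmax(logits_ext, dim=1).max(dim=1)
        if conf_ext.item() >= self.t_ext:     # "hard" input: extension block suffices
            return pred_ext, "extension"

        return None, "cloud"                  # "complex" input: offload to the cloud model

# Usage: route a single input through the edge exits.
model = MEANetSketch()
pred, exit_point = model.route(torch.randn(1, 64))
print(exit_point)
```

In this sketch the thresholds control the edge/cloud trade-off: raising them sends more inputs onward (higher accuracy, more communication and energy), while lowering them keeps more inferences on the device.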
