AI on the Edge: Rethinking AI-based IoT Applications Using Specialized Edge Architectures

Edge computing has emerged as a popular paradigm for supporting mobile and IoT applications with low-latency or high-bandwidth needs. Its appeal has grown with the recent availability of special-purpose hardware that accelerates specific compute tasks, such as deep learning inference, on edge nodes. In this paper, we experimentally compare the benefits and limitations of specialized edge systems, built using edge accelerators, with more traditional forms of edge and cloud computing. Our experimental study using edge-based AI workloads shows that, when normalized for power or cost, today's edge accelerators provide performance comparable to, and in many cases better than, that of traditional edge and cloud servers. They also provide latency and bandwidth benefits for split processing, across and within tiers, when using model compression or model splitting, but require dynamic methods to determine the optimal split across tiers. We find that edge accelerators can support varying degrees of concurrency for multi-tenant inference applications, but lack the isolation mechanisms necessary for edge cloud multi-tenant hosting.
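
To illustrate why the split across tiers may need to be chosen dynamically, the following minimal sketch picks the layer at which to hand a model off from an edge device to a cloud server so that end-to-end latency is lowest. It is not the method used in the paper: the function `best_split`, its parameters, and every number in the example are hypothetical, and the cost model assumes per-layer latencies simply add up and that returning the final result costs nothing.

```python
from typing import Sequence, Tuple


def best_split(edge_ms: Sequence[float], cloud_ms: Sequence[float],
               out_kb: Sequence[float], input_kb: float,
               bandwidth_kbps: float) -> Tuple[int, float]:
    """Return (k, latency_ms): layers [0, k) run on the edge, layers [k, n) in the cloud.

    edge_ms[i] / cloud_ms[i]: assumed latency of layer i on each tier (ms).
    out_kb[i]: assumed size of layer i's output (kilobytes).
    input_kb: size of the raw input (kilobytes).
    bandwidth_kbps: edge-to-cloud bandwidth (kilobits per second).
    """
    n = len(edge_ms)
    best_k, best_latency = 0, float("inf")
    for k in range(n + 1):
        if k == n:
            transfer_kb = 0.0            # everything runs on the edge
        elif k == 0:
            transfer_kb = input_kb       # ship the raw input to the cloud
        else:
            transfer_kb = out_kb[k - 1]  # ship the intermediate activation
        # kilobytes -> kilobits -> seconds -> milliseconds
        transfer_ms = transfer_kb * 8.0 / bandwidth_kbps * 1000.0
        latency = sum(edge_ms[:k]) + transfer_ms + sum(cloud_ms[k:])
        if latency < best_latency:
            best_k, best_latency = k, latency
    return best_k, best_latency


if __name__ == "__main__":
    # Made-up three-layer profile: the edge is slower per layer,
    # but activations shrink as the model gets deeper.
    edge = [10.0, 20.0, 30.0]
    cloud = [1.0, 2.0, 3.0]
    sizes = [100.0, 10.0, 1.0]
    for bw in (1_000.0, 10_000.0, 1_000_000.0):
        k, ms = best_split(edge, cloud, sizes, input_kb=500.0, bandwidth_kbps=bw)
        print(f"bandwidth={bw:>9.0f} kbps -> split after layer {k}, {ms:.1f} ms")
```

With this illustrative profile the chosen split moves from all-edge, to a mid-model split, to all-cloud as bandwidth increases, which is why a split fixed at deployment time can be suboptimal when network conditions change.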
