EdgeBOL: automating energy-savings for mobile edge AI

Supporting Edge AI services is one of the most exciting features of future mobile networks. These services collect and process voluminous data streams right at the network edge, so as to offer real-time and accurate inferences to users. However, their widespread deployment is hampered by the energy cost they impose on the network. To overcome this obstacle, we propose a Bayesian learning framework that jointly configures the service and the Radio Access Network (RAN), aiming to minimize the total energy consumption while respecting the desired accuracy and latency thresholds. Using a fully-fledged prototype with a software-defined base station (BS) and a GPU-enabled edge server, we profile a state-of-the-art video analytics AI service and identify new performance trade-offs. Accordingly, we tailor the optimization framework to account for the network context, the user needs, and the service metrics. The efficacy of our proposal is verified in a series of experiments and in comparisons with neural network-based benchmarks.
