Sustainable Task Offloading in UAV Networks via Multi-Agent Reinforcement Learning

The recent growth of IoT devices, together with edge computing, has opened up many opportunities for novel applications. Among them, Unmanned Aerial Vehicles (UAVs), deployed for tasks such as surveillance and environmental monitoring, are attracting increasing attention. In this context, solutions must cope with events that may change the state of the network while continuing to deliver a high level of performance. In this paper, we address this problem by proposing a distributed architecture that leverages Multi-Agent Reinforcement Learning (MARL) to dynamically offload tasks from UAVs to the edge cloud. Nodes of the system cooperate to jointly minimize the overall latency perceived by the user and the energy consumption on the UAVs by continuously learning the best action from the environment, namely whether to offload a task and, if so, which transmission technology to use, i.e., Wi-Fi or cellular. Results validate our distributed architecture and show the effectiveness of the approach in reaching both targets.
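To make the decision process concrete, the sketch below is a minimal illustration, not the authors' implementation: a tabular Q-learning agent whose action space matches the one described in the abstract (execute locally, offload over Wi-Fi, or offload over cellular), with a reward that penalizes both latency and UAV energy use. All names, the state handling, and the reward weights are assumptions introduced only for illustration.

import random
from collections import defaultdict

# Hypothetical action set matching the decision described in the abstract:
# run the task locally on the UAV, or offload it to the edge cloud via
# Wi-Fi or cellular.
ACTIONS = ["local", "offload_wifi", "offload_cellular"]

class OffloadingAgent:
    """Minimal tabular Q-learning agent for one UAV (illustrative only)."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.q = defaultdict(float)  # Q[(state, action)] -> estimated value
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def act(self, state):
        # Epsilon-greedy choice over the three offloading actions.
        if random.random() < self.epsilon:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def learn(self, state, action, reward, next_state):
        # Standard one-step Q-learning update.
        best_next = max(self.q[(next_state, a)] for a in ACTIONS)
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])

def reward(latency_s, energy_j, w_latency=1.0, w_energy=0.5):
    # Joint objective: lower user-perceived latency and lower UAV energy
    # consumption both increase the reward. The weights are assumed values,
    # not taken from the paper.
    return -(w_latency * latency_s + w_energy * energy_j)

In the cooperative setting described in the paper, one such agent would run per UAV and the nodes would share information so that decisions account for the network-wide latency and energy objective; the single-agent sketch above only illustrates the per-node action and reward structure.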
