Distributed Resource Scheduling for Large-Scale MEC Systems: A Multiagent Ensemble Deep Reinforcement Learning With Imitation Acceleration

We consider the optimization of distributed resource scheduling to minimize the sum of task latency and energy consumption over all Internet of Things devices (IoTDs) in a large-scale mobile edge computing (MEC) system. To address this problem, we propose a distributed intelligent resource scheduling (DIRS) framework, which combines centralized training that relies on global information with distributed decision making by an agent deployed on each MEC server. Specifically, we first introduce a novel multi-agent ensemble-assisted distributed deep reinforcement learning (DRL) architecture, which simplifies the neural network structure of each agent by partitioning the state space and improves on the performance of a single agent by combining the decisions of all the agents. Second, we apply action refinement to enhance the exploration ability of the proposed DIRS framework, where near-optimal state-action pairs are obtained by a novel Lévy flight search. Finally, we present an imitation acceleration scheme to pre-train all the agents, which significantly accelerates the learning process of the proposed framework by learning expert experience from a small amount of demonstration data. Extensive simulations demonstrate that the proposed DIRS framework is efficient and outperforms existing benchmark schemes.
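The action refinement step can be illustrated with a minimal sketch of a Lévy flight search. This is not the authors' implementation. It assumes that an action is a vector of offloading ratios in [0, 1], that heavy-tailed steps are drawn with Mantegna's algorithm, and that a candidate is kept only when it lowers the cost. The names `mantegna_sigma`, `levy_step`, `refine_action`, and `cost_fn`, as well as the parameters `scale` and `beta`, are hypothetical and chosen for illustration only.

```python
import math
import random


def mantegna_sigma(beta):
    """Scale of the numerator Gaussian in Mantegna's algorithm (0 < beta <= 2)."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    return (num / den) ** (1 / beta)


def levy_step(beta, rng):
    """Draw one heavy-tailed step: u / |v|^(1/beta), u ~ N(0, sigma^2), v ~ N(0, 1)."""
    u = rng.gauss(0.0, mantegna_sigma(beta))
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # guard against division by zero
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def refine_action(action, cost_fn, n_candidates=20, scale=0.1, beta=1.5, seed=0):
    """Greedy Levy flight search around an agent's proposed action.

    Assumption: each action component lies in [0, 1] and cost_fn returns the
    latency-plus-energy cost to be minimized.
    """
    rng = random.Random(seed)
    best, best_cost = list(action), cost_fn(action)
    for _ in range(n_candidates):
        # Perturb the current best action and clip back into [0, 1].
        cand = [min(1.0, max(0.0, a + scale * levy_step(beta, rng))) for a in best]
        cand_cost = cost_fn(cand)
        if cand_cost < best_cost:  # keep only improving candidates
            best, best_cost = cand, cand_cost
    return best, best_cost


if __name__ == "__main__":
    # Toy cost standing in for the latency-plus-energy objective.
    toy_cost = lambda a: sum((x - 0.3) ** 2 for x in a)
    print(refine_action([0.9, 0.1], toy_cost))
```

Most Lévy steps are small, which refines the action locally, while occasional long jumps let the search leave a poor neighborhood. In this sketch the refined action and its cost would then be stored as a near-optimal state-action pair for training.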
