Online Service Migration in Edge Computing with Incomplete Information: A Deep Recurrent Actor-Critic Method

Multi-access Edge Computing (MEC) is a key technology in the fifth-generation (5G) network and beyond. MEC extends cloud computing to the network edge (e.g., base stations and MEC servers) to support emerging resource-intensive applications on mobile devices. As a crucial problem in MEC, service migration decides where to migrate user services to maintain high Quality-of-Service (QoS) as users roam among MEC servers with limited coverage and capacity. However, finding an optimal migration policy is intractable due to the highly dynamic MEC environment and user mobility. Many existing works make centralized migration decisions based on complete system-level information, which can be time-consuming and suffers from scalability issues as the number of mobile users rapidly increases. To address these challenges, we propose a new learning-driven method, Deep Recurrent Actor-Critic based service Migration (DRACM), which is user-centric and makes effective online migration decisions given incomplete system-level information. Specifically, we model the service migration problem as a Partially Observable Markov Decision Process (POMDP). To solve the POMDP, we design an encoder network that combines a Long Short-Term Memory (LSTM) network and an embedding matrix for effective extraction of hidden information, and we propose a tailored off-policy actor-critic algorithm with a clipped surrogate objective for efficient training. Results from extensive experiments based on real-world mobility traces demonstrate that our method consistently outperforms both heuristic and state-of-the-art learning-driven algorithms, and achieves near-optimal results in various MEC scenarios.
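
To make the two components named in the abstract concrete, the Python (PyTorch) sketch below illustrates (i) an encoder that combines an embedding matrix for discrete MEC-server indices with an LSTM that summarizes the observation history into a hidden belief state, and (ii) a PPO-style clipped surrogate objective. This is a minimal sketch under assumed input shapes and hyperparameters (embedding size, hidden size, clipping threshold eps=0.2); the class and function names are illustrative and not the authors' implementation.

import torch
import torch.nn as nn

class RecurrentEncoder(nn.Module):
    """Embed discrete server indices, concatenate them with continuous
    observation features, and summarize the history with an LSTM."""
    def __init__(self, num_servers, embed_dim=16, feat_dim=8, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(num_servers, embed_dim)  # embedding matrix
        self.lstm = nn.LSTM(embed_dim + feat_dim, hidden_dim, batch_first=True)

    def forward(self, server_ids, features, hidden=None):
        # server_ids: (batch, seq) long tensor; features: (batch, seq, feat_dim)
        x = torch.cat([self.embed(server_ids), features], dim=-1)
        out, hidden = self.lstm(x, hidden)
        return out, hidden  # per-step belief features fed to actor and critic heads

def clipped_surrogate_loss(logp_new, logp_old, advantages, eps=0.2):
    """PPO-style clipped surrogate objective; returns the negated objective
    so it can be minimized with gradient descent."""
    ratio = torch.exp(logp_new - logp_old)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - eps, 1.0 + eps) * advantages
    return -torch.min(unclipped, clipped).mean()

In this sketch, the LSTM hidden state plays the role of the belief over the unobserved system-level state, and the clipping term bounds how far each off-policy update can move the policy from the one that collected the data.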
