DeepHop on Edge: Hop-by-hop Routing by Distributed Learning with Semantic Attention

Multi-access Edge Computing (MEC) and ubiquitous smart devices serve end-users efficiently by providing emerging edge-deployed services. At the same time, they generate heavy and time-varying traffic loads in the edge network, which call for an efficient traffic forwarding mechanism. In this paper, we propose DeepHop, a parallel and distributed learning approach that adapts to volatile environments and realizes hop-by-hop routing. Multi-Agent Deep Reinforcement Learning (MADRL) is used to alleviate edge network congestion and maximize the utilization of network resources. DeepHop determines routes among edge network nodes for heterogeneous types of traffic according to their current workload and capability. By incorporating an attention mechanism, DeepHop extracts semantics from the elements of the network state, helping the agents learn each element's importance for routing. Experimental results show that DeepHop increases the number of successfully transmitted packets by 15% compared with state-of-the-art algorithms. Moreover, DeepHop with the attention mechanism reduces convergence time by nearly half compared with commonly used neural network structures.
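To make the attention idea concrete, the following is a minimal sketch, not the authors' implementation: a per-agent policy that applies scaled dot-product attention over the elements of the local network state (e.g., per-link load, capacity, queue length) before scoring candidate next hops. It assumes a PyTorch setting, and all names (HopPolicyWithAttention, elem_dim, num_neighbors) are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HopPolicyWithAttention(nn.Module):
    """Hypothetical per-agent policy: attends over network-state elements
    so the agent can weight each element's importance for routing."""

    def __init__(self, elem_dim: int, hidden_dim: int, num_neighbors: int):
        super().__init__()
        # Projections for scaled dot-product attention over state elements.
        self.query = nn.Linear(elem_dim, hidden_dim)
        self.key = nn.Linear(elem_dim, hidden_dim)
        self.value = nn.Linear(elem_dim, hidden_dim)
        # Policy head: scores each neighboring node as the next hop.
        self.policy = nn.Linear(hidden_dim, num_neighbors)

    def forward(self, state_elems: torch.Tensor) -> torch.Tensor:
        # state_elems: (batch, num_elems, elem_dim)
        q = self.query(state_elems)
        k = self.key(state_elems)
        v = self.value(state_elems)
        scale = q.size(-1) ** 0.5
        # Attention weights reflect each element's learned importance.
        attn = F.softmax(q @ k.transpose(-2, -1) / scale, dim=-1)
        context = (attn @ v).mean(dim=1)  # pooled, attention-weighted state
        return F.softmax(self.policy(context), dim=-1)  # next-hop distribution

# Usage: 6 state elements of width 4, choosing among 5 neighbor nodes.
agent = HopPolicyWithAttention(elem_dim=4, hidden_dim=32, num_neighbors=5)
probs = agent(torch.randn(1, 6, 4))
next_hop = torch.argmax(probs, dim=-1)
```

In a MADRL setup such as the one the abstract describes, each node would run an agent of this form and train it with an actor-critic method; the attention weights give the semantic importance of each state element, which is the property credited with the faster convergence.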
