Robust Path Selection in Software-defined WANs using Deep Reinforcement Learning

Efficient network traffic engineering requires the network to continuously measure a new traffic matrix and update its set of paths, so an automated process is needed to decide quickly when to update and which set of paths to use. Unfortunately, finding the optimal solution for the network update in every time interval is costly, because the computational complexity of optimization approaches based on linear programming grows significantly with the size of the network. In this paper, we use deep reinforcement learning to derive a data-driven path selection algorithm that accounts for the overhead of route computation and path updates. Our proposed scheme leverages information about past network behavior to identify a set of robust paths that can be used for multiple future time intervals, avoiding frequent updates to the forwarding behavior of routers. We compare our approach to other traffic engineering (TE) solutions through extensive simulations on real network topologies. Our results show that our scheme reduces link utilization by about 40% compared to traditional TE schemes such as ECMP. It yields a slightly higher link utilization (around 25%) than schemes that only minimize link utilization and ignore path-update overhead.
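The trade-off described above, between low link utilization and few path updates, can be illustrated with a minimal sketch. This is not the reward or algorithm used in the paper: the function names, the linear penalty, and the `update_weight` parameter are all assumptions chosen only to make the idea concrete.

```python
from typing import Dict, List, Tuple

Link = Tuple[str, str]   # directed link (src, dst)
Path = List[str]         # sequence of node names


def max_link_utilization(paths: Dict[str, Path],
                         demands: Dict[str, float],
                         capacity: Dict[Link, float]) -> float:
    """Return the highest load/capacity ratio over all links."""
    load = {link: 0.0 for link in capacity}
    for flow, path in paths.items():
        # Each flow places its full demand on every link of its path.
        for link in zip(path, path[1:]):
            load[link] += demands[flow]
    return max(load[link] / capacity[link] for link in capacity)


def count_path_updates(old: Dict[str, Path], new: Dict[str, Path]) -> int:
    """Count flows whose path differs between two path sets."""
    return sum(1 for flow in new if old.get(flow) != new[flow])


def reward(old: Dict[str, Path],
           new: Dict[str, Path],
           demands: Dict[str, float],
           capacity: Dict[Link, float],
           update_weight: float = 0.1) -> float:
    """Hypothetical reward: penalize both congestion and path churn.

    update_weight is an assumed knob, not a value from the paper.
    """
    congestion = max_link_utilization(new, demands, capacity)
    churn = count_path_updates(old, new)
    return -(congestion + update_weight * churn)


# Toy example: two flows from a to c over a three-link topology.
capacity = {("a", "b"): 10.0, ("b", "c"): 10.0, ("a", "c"): 10.0}
demands = {"f1": 5.0, "f2": 5.0}
old_paths = {"f1": ["a", "b", "c"], "f2": ["a", "b", "c"]}
new_paths = {"f1": ["a", "b", "c"], "f2": ["a", "c"]}

keep_reward = reward(old_paths, old_paths, demands, capacity)
move_reward = reward(old_paths, new_paths, demands, capacity)
```

In this toy case, rerouting one flow halves the maximum link utilization at the cost of a single path update, so the rerouted path set scores higher. With a much larger `update_weight`, keeping the old paths would win instead, which is the kind of balance a learned policy would have to strike.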
