Routing in optical transport networks with deep reinforcement learning

Deep reinforcement learning (DRL) has recently revolutionized the way decision-making and automated control problems are solved. In networking, there is a growing trend in the research community to apply DRL algorithms to optimization problems such as routing. However, existing proposals fail to achieve good results, often underperforming traditional routing techniques. We argue that the reason behind this poor performance is the straightforward network representations they use. In this paper, we propose a DRL-based solution for routing in optical transport networks (OTNs). In contrast to previous works, we propose a more elaborate representation of the network state that reduces the level of knowledge abstraction required from DRL agents and easily captures the particularities of network topologies. Our evaluation results show that, with this novel representation, DRL agents achieve better performance and learn how to route traffic in OTNs significantly faster than with state-of-the-art representations. Additionally, we reverse-engineered the routing strategy learned by our DRL agent and, as a result, found a routing algorithm that outperforms well-known traditional routing heuristics.
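
The core technical idea above is the representation of the network state fed to the DRL agent. As a rough, hypothetical illustration of what such an input could look like, the Python sketch below encodes, for every directed link, its remaining capacity together with the load that the incoming traffic demand would add on each of k candidate paths. The function names, the demand format, and the choice of features are assumptions made here for illustration only, not the representation actually proposed in the paper.

```python
# Hypothetical sketch of a link-level state encoding for a DRL routing agent:
# per-link remaining capacity plus the current demand "painted" onto each of
# k candidate paths. Names and structure are illustrative assumptions.
from itertools import islice

import networkx as nx
import numpy as np


def k_shortest_paths(graph, src, dst, k=4):
    """Return up to k loop-free candidate paths between src and dst."""
    return list(islice(nx.shortest_simple_paths(graph, src, dst), k))


def build_state(graph, demand, candidate_paths):
    """Encode the network state as a flat feature vector.

    For every directed link the vector holds its remaining capacity and,
    for each candidate path, the extra load the current demand would place
    on that link if routed over that path.
    """
    links = list(graph.edges())
    link_index = {link: i for i, link in enumerate(links)}

    # One column of remaining capacities, plus one column per candidate path.
    state = np.zeros((len(links), 1 + len(candidate_paths)), dtype=np.float32)
    for link, i in link_index.items():
        state[i, 0] = graph.edges[link]["capacity"]

    for p, path in enumerate(candidate_paths):
        for u, v in zip(path[:-1], path[1:]):
            state[link_index[(u, v)], 1 + p] = demand["bandwidth"]

    return state.flatten()


# Toy usage on a 4-node ring with directed links of equal capacity.
g = nx.DiGraph()
for u, v in [(0, 1), (1, 2), (2, 3), (3, 0)]:
    g.add_edge(u, v, capacity=8.0)
    g.add_edge(v, u, capacity=8.0)

demand = {"src": 0, "dst": 2, "bandwidth": 1.0}
paths = k_shortest_paths(g, demand["src"], demand["dst"], k=2)
print(build_state(g, demand, paths).shape)  # (num_links * (1 + k),)
```

In a setup like this, the agent's action would simply be the index of one of the k candidate paths, which keeps the action space small regardless of topology size; whether this matches the paper's exact design is not stated in the abstract.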
