Actor-Critic-Based Resource Allocation for Multi-Modal Optical Networks

With the rapid development of optical networks, the network state exhibits an increasing number of features. For example, flexi-grid technology introduces readily observable features related to spectral constraints, as well as deeper features related to spectral fragmentation. However, because of the complex non-linear relationships among these features and the optimization objectives, traditional heuristic algorithms for resource allocation sometimes fail to discover and exploit suitable combinations of features. Reinforcement learning (RL) is an autonomous learning technique that can automatically extract the essential features for network optimization under different objectives. In this paper, we introduce the concept of multi-modal optical networks to represent the different features of optical networks, and we propose an actor-critic-based resource allocation (ACRA) algorithm to improve the performance of resource allocation in optical networks. Simulation results show that the multi-modal representation accelerates learning, and that the proposed ACRA algorithm can optimize resource allocation.
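For readers unfamiliar with the actor-critic family, the following is a minimal tabular sketch of a generic one-step actor-critic update. It is not the ACRA algorithm itself, whose state representation and network architecture are not described here. The toy environment is an assumption made purely for illustration: two candidate paths, a state that indicates which path is currently congested, and a reward of +1 for choosing the free path and -1 otherwise. The names `train_actor_critic`, `softmax`, and all hyperparameter values are hypothetical.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train_actor_critic(steps=5000, alpha_actor=0.1, alpha_critic=0.1,
                       gamma=0.9, seed=0):
    """One-step actor-critic on a hypothetical two-path toy problem.

    State s means "path s is congested"; action a means "route on path a".
    This environment is an illustrative assumption, not the paper's model.
    """
    rng = random.Random(seed)
    n_states, n_actions = 2, 2
    # Actor: one preference per (state, action) pair, softmax policy.
    theta = [[0.0] * n_actions for _ in range(n_states)]
    # Critic: one state-value estimate per state.
    value = [0.0] * n_states

    state = rng.randrange(n_states)
    for _ in range(steps):
        probs = softmax(theta[state])
        action = rng.choices(range(n_actions), weights=probs)[0]

        # Assumed reward: +1 if the request avoids the congested path.
        reward = 1.0 if action != state else -1.0
        next_state = rng.randrange(n_states)

        # Critic: temporal-difference error and value update.
        td_error = reward + gamma * value[next_state] - value[state]
        value[state] += alpha_critic * td_error

        # Actor: policy-gradient step, using the TD error as the advantage.
        for a in range(n_actions):
            grad_log_pi = (1.0 if a == action else 0.0) - probs[a]
            theta[state][a] += alpha_actor * td_error * grad_log_pi

        state = next_state
    return theta, value


if __name__ == "__main__":
    theta, value = train_actor_critic()
    for s in range(2):
        print("congested path:", s, "policy:", softmax(theta[s]))
```

In this sketch the critic's TD error plays the role of the advantage signal that steers the actor. A realistic allocator would replace the two tables with function approximators over a much richer network state.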
