Evaluating Reinforcement Learning Methods for Bundle Routing Control

Cognitive networking applications continuously adapt their actions according to observations of the environment and assigned performance goals. This paper evaluates one such application, whose aim is to route bundles over parallel links with different characteristics. Several machine learning algorithms may be suitable for the task. This research tested four reinforcement learning methods as potential enablers for this application: Q-Routing, Double Q-Learning, an actor-critic Learning Automata implementing the S-model, and the Cognitive Network Controller (CNC), which uses a spiking neural network for Q-value prediction. All methods were evaluated under the same experimental conditions. In both a stable and a time-varying environment with respect to link quality, each routing method was evaluated with an identical number of bundle transmissions generated at a common rate. The measurements indicate that, in general, the CNC performs better than the other methods, followed by the Learning Automata. In the presented tests, Q-Routing and Double Q-Learning performed similarly to a non-learning round-robin approach. It is expected that these results will help to guide and improve the design of this and future cognitive networking applications.
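To make the routing task concrete, the following is a minimal sketch of a tabular, Q-Routing-style link selector for a single node with parallel links. It is not the implementation evaluated in the paper. The class and function names, the epsilon-greedy exploration, the learning rate, and the exponentially distributed link delays are all assumptions made for illustration. Because the links are parallel and lead to the same destination, the remaining-path term of Q-Routing is taken to be zero, so each update reduces to a running estimate of per-link delivery delay.

```python
import random


class QLinkSelector:
    """Hypothetical tabular delay estimator for choosing among parallel links."""

    def __init__(self, n_links, alpha=0.1, epsilon=0.1, seed=None):
        self.q = [0.0] * n_links   # estimated delivery delay per link
        self.alpha = alpha         # learning rate (assumed value)
        self.epsilon = epsilon     # exploration probability (assumed value)
        self.rng = random.Random(seed)

    def select(self):
        # Explore a random link with probability epsilon,
        # otherwise pick the link with the lowest estimated delay.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.q))
        return min(range(len(self.q)), key=self.q.__getitem__)

    def update(self, link, delay):
        # Move the estimate toward the observed delay.
        self.q[link] += self.alpha * (delay - self.q[link])


def simulate(mean_delays, n_bundles=2000, seed=0):
    """Send n_bundles over links with the given mean delays (assumed model)."""
    env_rng = random.Random(seed)
    selector = QLinkSelector(len(mean_delays), seed=seed + 1)
    total = 0.0
    for _ in range(n_bundles):
        link = selector.select()
        # Assumption: exponentially distributed delay on each link.
        delay = env_rng.expovariate(1.0 / mean_delays[link])
        selector.update(link, delay)
        total += delay
    return total / n_bundles, selector.q


if __name__ == "__main__":
    avg_delay, estimates = simulate([1.0, 5.0, 10.0])
    print("average delay:", avg_delay)
    print("per-link estimates:", estimates)
```

In this toy setting, a round-robin sender would see an average delay equal to the mean of the three link delays, which gives a simple baseline to compare the learner against. A time-varying environment could be approximated by changing `mean_delays` partway through the loop.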
