Ants and Reinforcement Learning: A Case Study in Routing in Dynamic Networks

We investigate two new distributed routing algorithms for data networks based on simple biological "ants" that explore the network and rapidly learn good routes, using a novel variation of reinforcement learning. Both algorithms adapt fully to topology changes and to changes in link costs, and their space and computational overheads are competitive with those of traditional packet routing algorithms: although they can generate more routing traffic when the rate of failures in a network is low, they perform much better under higher failure rates. Both algorithms are also more resilient than traditional algorithms, in the sense that random corruption of routing state has limited impact on the computation of paths. We present convergence theorems for both algorithms, drawing on the theory of non-stationary and stationary discrete-time Markov chains over the reals. We then present an extensive empirical evaluation of our algorithms on a simulator that is widely used in the computer networks community for validating and testing protocols. The evaluation compares our new algorithms with two traditional routing algorithms in current use on data delivery performance, aggregate routing traffic (algorithm overhead), and degree of resilience. We also show, using a realistic topology, that the performance of our algorithms scales well as network size increases.