Extended Replicator Dynamics as a Key to Reinforcement Learning in Multi-agent Systems

Modeling learning agents in Multi-agent Systems requires an adequate understanding of their dynamic behaviour. Evolutionary Game Theory provides dynamics that describe how strategies evolve over time. Borgers et al. [1] and Tuyls et al. [11] have shown how classical Reinforcement Learning (RL) techniques such as Cross-learning and Q-learning relate to the Replicator Dynamics (RD), which gives a better understanding of the learning process. In this paper, we introduce an extension of the Replicator Dynamics from Evolutionary Game Theory. Based on these new dynamics, we develop a Reinforcement Learning algorithm that attains a stable Nash equilibrium for all types of games; no such algorithm currently exists. These dynamics open an interesting perspective for introducing new Reinforcement Learning algorithms in multi-state games and Multi-agent Systems.
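For readers unfamiliar with the Replicator Dynamics, the following is a minimal sketch of the standard two-population form for a bimatrix game. It is not the extension introduced in this paper. The function names (`replicator_step`, `simulate`), the Euler discretization, the step size `dt`, and the Prisoner's Dilemma payoffs are all illustrative assumptions.

```python
# Sketch of the STANDARD two-population replicator dynamics (not the
# paper's extension). All names and parameter values are hypothetical.

def replicator_step(x, y, A, B, dt=0.01):
    """One explicit Euler step.

    x, y    : mixed strategies (lists summing to 1) of the row and column player
    A[i][j] : payoff to the row player when row plays i and column plays j
    B[i][j] : payoff to the column player for the same action pair
    """
    n, m = len(x), len(y)
    # Expected payoff of each pure action against the opponent's current mix.
    fx = [sum(A[i][j] * y[j] for j in range(m)) for i in range(n)]
    fy = [sum(B[i][j] * x[i] for i in range(n)) for j in range(m)]
    # Average payoff of each population under its current mix.
    avg_x = sum(x[i] * fx[i] for i in range(n))
    avg_y = sum(y[j] * fy[j] for j in range(m))
    # Actions that do better than average grow; the others shrink.
    new_x = [x[i] + dt * x[i] * (fx[i] - avg_x) for i in range(n)]
    new_y = [y[j] + dt * y[j] * (fy[j] - avg_y) for j in range(m)]
    return new_x, new_y


def simulate(x, y, A, B, steps=5000, dt=0.01):
    """Iterate the Euler step and return the final pair of strategies."""
    for _ in range(steps):
        x, y = replicator_step(x, y, A, B, dt)
    return x, y


if __name__ == "__main__":
    # Prisoner's Dilemma (assumed payoffs): action 0 = cooperate, 1 = defect.
    A = [[3, 0], [5, 1]]
    B = [[3, 5], [0, 1]]
    x, y = simulate([0.5, 0.5], [0.5, 0.5], A, B)
    print("row player:", x)
    print("column player:", y)
```

In this example defection strictly dominates, so both populations drift toward the pure defect strategy, which is the game's unique Nash equilibrium. The standard dynamics do not settle on a Nash equilibrium in every game, which is the gap the extended dynamics are meant to address.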