Reinforcement Learning through Interaction among Multiple Agents

In ordinary reinforcement learning algorithms, a single agent learns to achieve a goal over many episodes. If the learning problem is complicated, obtaining the optimal policy may take a great deal of computation time. Meanwhile, in optimization, multi-agent search methods such as particle swarm optimization are recognized as being able to rapidly find a globally optimal solution for multimodal functions with a wide solution space. This paper proposes a reinforcement learning algorithm that uses multiple agents. In this algorithm, the agents learn not only from their own experience but also through interaction with one another. For the interaction, this paper proposes three strategies: the best action-value strategy, the average action-value strategy, and the particle swarm strategy.
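As an illustration only, the three interaction strategies might be sketched as below. The abstract does not state the update rules, so everything here is an assumption: the Q-table representation (a dict keyed by state-action pairs), the function names, the per-agent `scores` used to pick the best agent, and the inertia and acceleration parameters `w`, `c1`, `c2`, which are borrowed from the standard particle swarm optimization update rather than taken from the paper.

```python
import random

# Hypothetical representation: each agent's action values are a dict
# mapping (state, action) -> Q-value. All agents share the same keys.


def best_action_value(q_tables, scores):
    """Assumed rule: every agent adopts the Q-table of the best-scoring agent."""
    best = max(range(len(q_tables)), key=lambda i: scores[i])
    return [dict(q_tables[best]) for _ in q_tables]


def average_action_value(q_tables):
    """Assumed rule: every agent adopts the mean Q-value over all agents."""
    n = len(q_tables)
    mean = {k: sum(q[k] for q in q_tables) / n for k in q_tables[0]}
    return [dict(mean) for _ in q_tables]


def particle_swarm(q_tables, velocities, personal_best, global_best,
                   w=0.5, c1=1.5, c2=1.5, rng=random):
    """Assumed rule: standard PSO update applied to each Q-value.

    Each Q-value is treated as a particle coordinate that is pulled toward
    the agent's own best table and the best table found by any agent.
    """
    new_q, new_v = [], []
    for q, v, pb in zip(q_tables, velocities, personal_best):
        qi, vi = {}, {}
        for k in q:
            r1, r2 = rng.random(), rng.random()
            vi[k] = (w * v[k]
                     + c1 * r1 * (pb[k] - q[k])
                     + c2 * r2 * (global_best[k] - q[k]))
            qi[k] = q[k] + vi[k]
        new_q.append(qi)
        new_v.append(vi)
    return new_q, new_v
```

In a sketch like this, each agent would run its own reinforcement learning updates (for example Q-learning) for some number of episodes, after which one of the functions above would be applied to exchange information among the agents.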
