Learning of communication codes in a multi-agent reinforcement learning problem

Realizing cooperative behavior in a multi-agent system is important for improving its problem-solving ability. Reinforcement learning is one method by which agents can learn such cooperative behavior. In this paper, we consider the pursuit problem as a multi-agent reinforcement learning task with communication between agents. In our study, the agents acquire communication codes through learning. Here, the codes are rules for communicating appropriate information in various situations. We call the learning of communication codes signal learning. A signal is expressed as a bit sequence whose length is variable. We carried out experiments comparing performance while varying the signal length from 0 to 4 bits. The results show that, in terms of learning precision, communication with 1 or more bits outperformed the case of no communication. They also show that 4-bit communication produced the best result among the five cases, while learning with longer signals required many more iterations.
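
As a rough illustration of the signal representation (a sketch only, not the authors' implementation), the following enumerates the signals available at each of the five lengths compared. The helper name `enumerate_signals` is hypothetical. A length of 0 is treated here as a single empty signal, which is assumed to correspond to the no-communication case.

```python
from itertools import product


def enumerate_signals(length):
    """Return every bit-string signal of the given length.

    Hypothetical helper for illustration; length 0 yields one empty
    signal, assumed here to stand for "no communication".
    """
    return ["".join(bits) for bits in product("01", repeat=length)]


if __name__ == "__main__":
    # Signal lengths 0 to 4 bits, matching the five cases compared.
    for length in range(5):
        signals = enumerate_signals(length)
        print(f"{length} bit(s): {len(signals)} possible signal(s)")
```

The number of distinct signals doubles with each added bit, so an agent choosing a signal has more options to evaluate as the length grows. This may contribute to the larger number of iterations observed for longer signals, although the abstract itself does not state the cause.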