Adaptive Two-stage Learning Algorithm for Repeated Games

People in society engage in a wide variety of interactions. To analyze them, we model each interaction as a game and the people as agents equipped with reinforcement learning algorithms. Reinforcement learning algorithms are widely studied with the goal of identifying strategies that yield large payoffs in games; however, existing algorithms learn slowly because they require a large number of interactions. In this work, we construct an algorithm that both learns quickly and maximizes payoffs in various repeated games. The proposed algorithm combines two different algorithms, one used in the early stage of learning and the other in the later stage. We conducted experiments in which the proposed agents played ten kinds of games, both in self-play and against other agents. The results show that the proposed algorithm learned more quickly than existing algorithms and gained sufficiently large payoffs in nine of the ten games.
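
To make the two-stage idea concrete, the following is a minimal sketch under stated assumptions, not the algorithm evaluated in this work. The abstract does not say which two algorithms are combined or how the switch between stages is decided, so everything here is hypothetical: the names `TwoStageAgent` and `EpsilonGreedyLearner`, the fixed `switch_round`, the choice to let both learners observe every outcome, and the Prisoner's Dilemma payoff table used for self-play.

```python
import random

# Hypothetical Prisoner's Dilemma payoffs: action 0 = cooperate, 1 = defect.
PAYOFFS = {(0, 0): (3, 3), (0, 1): (0, 5), (1, 0): (5, 0), (1, 1): (1, 1)}


class EpsilonGreedyLearner:
    """Stand-in learner (an assumption): tracks the mean payoff of each action."""

    def __init__(self, n_actions, epsilon, rng):
        self.n_actions = n_actions
        self.epsilon = epsilon
        self.rng = rng
        self.totals = [0.0] * n_actions
        self.counts = [0] * n_actions

    def act(self):
        # Explore with probability epsilon, otherwise pick the best mean so far.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        means = [t / c if c else 0.0 for t, c in zip(self.totals, self.counts)]
        return max(range(self.n_actions), key=means.__getitem__)

    def update(self, action, payoff):
        self.totals[action] += payoff
        self.counts[action] += 1


class TwoStageAgent:
    """Uses `early` before `switch_round` and `late` from then on (assumed rule)."""

    def __init__(self, early, late, switch_round):
        self.early = early
        self.late = late
        self.switch_round = switch_round
        self.round = 0

    def current(self):
        return self.early if self.round < self.switch_round else self.late

    def act(self):
        return self.current().act()

    def update(self, action, payoff):
        # Assumption: both learners see every outcome, so the late-stage
        # learner already has statistics when control is handed over.
        self.early.update(action, payoff)
        self.late.update(action, payoff)
        self.round += 1


def play(agent_a, agent_b, rounds):
    """Play a repeated game and return the cumulative payoff of each agent."""
    total_a = total_b = 0
    for _ in range(rounds):
        a, b = agent_a.act(), agent_b.act()
        pay_a, pay_b = PAYOFFS[(a, b)]
        agent_a.update(a, pay_a)
        agent_b.update(b, pay_b)
        total_a += pay_a
        total_b += pay_b
    return total_a, total_b


def make_agent(rng, switch_round=20):
    # Exploratory learner first, then a mostly greedy one (illustrative values).
    early = EpsilonGreedyLearner(2, epsilon=0.5, rng=rng)
    late = EpsilonGreedyLearner(2, epsilon=0.05, rng=rng)
    return TwoStageAgent(early, late, switch_round)


if __name__ == "__main__":
    rng = random.Random(0)
    print(play(make_agent(rng), make_agent(rng), rounds=100))
```

The fixed round threshold is the simplest possible switching rule; an adaptive variant could instead switch once the early-stage payoff estimates stop changing, but that criterion is likewise an assumption and is not taken from this work.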