Use All Your Skills, Not Only The Most Popular Ones

Reinforcement Learning (RL) has shown promising results across various domains. However, applying it to develop game-playing agents is challenging due to the sparsity of extrinsic rewards: agents receive a reward from the environment only at the end of a game level. Previous work has shown that intrinsic rewards are an effective way to handle such cases, as they allow basic skills to be incorporated into agent policies that then generalize better across game levels. During gameplay, it is common for certain actions (skills) to be observed more often than others, which biases the agent's action selection. This problem boils down to a normalization issue in the formulation of the skill-based reward function. In this paper, we propose a novel solution that accounts for the frequency of all skills in the reward function. We show that our method improves agent performance, enabling agents to select effective skills up to 2.5 times more frequently than the state-of-the-art approach, in the context of the match-3 game Candy Crush Friends Saga.
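
As a rough illustrative sketch only (the symbols and the specific normalization below are assumptions for exposition, not the paper's exact formulation), a frequency-aware intrinsic reward can down-weight skills that are observed often, so that rarely used but effective skills still contribute:

$$
r^{\text{int}}_t \;=\; \sum_{k \in \mathcal{K}} \frac{\mathbb{1}\!\left[\text{skill } k \text{ is triggered at step } t\right]}{f_k + \epsilon},
$$

where $\mathcal{K}$ is the set of skills, $f_k$ is the empirical frequency with which skill $k$ has been observed so far, and $\epsilon > 0$ avoids division by zero. Dividing by $f_k$ normalizes each skill's contribution, so the reward no longer favors only the most popular skills.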