Application of Deep-RL with a Sample-Efficient Method to Mini-games of StarCraft II

A key challenge of deep reinforcement learning (Deep-RL) is the large number of samples and the long training time required in domains with large state and action spaces. To remedy these problems, we focus on improving the sample efficiency of Deep-RL. We incorporate self-imitation learning (SIL) into the state-of-the-art IMPALA (Importance Weighted Actor-Learner Architecture) algorithm to learn mini-games of StarCraft II, a domain that has long challenged Deep-RL. Our results show that our agents learn faster and more stably, and reach higher final performance, than agents trained with plain IMPALA on two StarCraft II mini-games.
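
To make the combination concrete, the sketch below shows the self-imitation learning objective (Oh et al., 2018) that would be added as an auxiliary loss alongside IMPALA's usual V-trace actor-critic loss: the agent imitates only those stored transitions whose observed return exceeded its current value estimate. This is a minimal illustration under assumed shapes, not the paper's implementation; the function name sil_loss, the value_coef hyperparameter, and the PyTorch discrete-action setup are all assumptions.

```python
import torch

def sil_loss(logits, values, actions, returns, value_coef=0.01):
    """Self-imitation learning auxiliary loss (hypothetical sketch).

    logits:  (B, num_actions) policy logits for replayed states
    values:  (B,) current value estimates V(s)
    actions: (B,) actions taken in the replayed transitions
    returns: (B,) observed discounted returns R from the replay buffer
    """
    # Only imitate transitions that turned out better than expected:
    # advantage = max(R - V(s), 0), treated as a constant for the policy.
    advantage = (returns - values).clamp(min=0).detach()

    # Policy term: increase log-probability of the good past actions,
    # weighted by how much they beat the value estimate.
    log_probs = torch.log_softmax(logits, dim=-1)
    chosen = log_probs.gather(1, actions.unsqueeze(-1)).squeeze(-1)
    policy_loss = -(chosen * advantage).mean()

    # Value term: (1/2) * max(R - V(s), 0)^2, pushing V up toward
    # returns it underestimated (gradient flows through `values` here).
    value_loss = 0.5 * ((returns - values).clamp(min=0) ** 2).mean()

    return policy_loss + value_coef * value_loss
```

In this setup the SIL loss would be computed on samples drawn from a prioritized replay buffer and added to IMPALA's on-policy loss each update, so off-policy replay of successful episodes supplements the actor-learner stream without replacing it.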