Hierarchical Reinforcement Learning for Multi-agent MOBA Game

Real Time Strategy (RTS) games require macro strategies as well as micro strategies to obtain satisfactory performance, since they have large state spaces, large action spaces, and hidden information. This paper presents a novel hierarchical reinforcement learning model for mastering Multiplayer Online Battle Arena (MOBA) games, a sub-genre of RTS games. The novelty of this work is threefold: (1) a hierarchical framework in which agents execute macro strategies learned by imitation learning and carry out micromanipulations through reinforcement learning, (2) a simple self-learning method that improves sample efficiency during training, and (3) a dense reward function for multi-agent cooperation in the absence of a game engine or Application Programming Interface (API). Finally, various experiments validate the superior performance of the proposed method over other state-of-the-art reinforcement learning algorithms. The agent learns to combat and defeat the bronze-level built-in AI with a 100% win rate, and experiments show that our method can produce a competitive multi-agent system for the mobile MOBA game {\it King of Glory} in 5v5 mode.
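As a rough illustration of the hierarchical framework summarized above, the sketch below separates an imitation-learned macro-strategy head from an RL-trained micro-action head conditioned on the chosen strategy. This is a minimal sketch, not the authors' implementation: the network sizes, the set of macro strategies, the observation encoding, and all names (MacroPolicy, MicroPolicy, act) are illustrative assumptions.

```python
# Minimal sketch of a two-level hierarchical policy (assumed structure, not the
# paper's exact architecture): a macro head picks a strategy, a micro head picks
# a low-level action conditioned on that strategy.
import torch
import torch.nn as nn


class MacroPolicy(nn.Module):
    """Macro-strategy head, intended to be trained by imitation learning."""

    def __init__(self, obs_dim: int, num_strategies: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, num_strategies),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # Logits over macro strategies (e.g. push, farm, defend, group up).
        return self.net(obs)


class MicroPolicy(nn.Module):
    """Micro-action head, intended to be trained by RL (e.g. PPO),
    conditioned on the observation and the chosen macro strategy."""

    def __init__(self, obs_dim: int, num_strategies: int, num_actions: int,
                 hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + num_strategies, hidden), nn.ReLU(),
            nn.Linear(hidden, num_actions),
        )

    def forward(self, obs: torch.Tensor, strategy_onehot: torch.Tensor) -> torch.Tensor:
        # Logits over micro actions (move, attack, cast skill, ...).
        return self.net(torch.cat([obs, strategy_onehot], dim=-1))


def act(macro: MacroPolicy, micro: MicroPolicy, obs: torch.Tensor, num_strategies: int):
    """One decision step: choose a macro strategy, then a micro action under it."""
    with torch.no_grad():
        strategy = macro(obs).argmax(dim=-1)
        strategy_onehot = nn.functional.one_hot(strategy, num_strategies).float()
        action = torch.distributions.Categorical(
            logits=micro(obs, strategy_onehot)).sample()
    return strategy, action


if __name__ == "__main__":
    obs_dim, num_strategies, num_actions = 64, 4, 12  # illustrative sizes
    macro = MacroPolicy(obs_dim, num_strategies)
    micro = MicroPolicy(obs_dim, num_strategies, num_actions)
    s, a = act(macro, micro, torch.randn(1, obs_dim), num_strategies)
    print("macro strategy:", s.item(), "micro action:", a.item())
```

In this kind of split, the macro head supplies temporally extended guidance while the micro head handles frame-level control; a dense, cooperation-oriented reward (as the abstract describes) would be applied when optimizing the micro head.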
