ELF: An Extensive, Lightweight and Flexible Research Platform for Real-time Strategy Games

In this paper, we propose ELF, an Extensive, Lightweight and Flexible platform for fundamental reinforcement learning research. Using ELF, we implement a highly customizable real-time strategy (RTS) engine with three game environments (Mini-RTS, Capture the Flag and Tower Defense). Mini-RTS, a miniature version of StarCraft, captures key game dynamics and runs at 40K frames per second (FPS) per core on a MacBook Pro notebook. When coupled with modern reinforcement learning methods, the system can train a full-game bot against built-in AIs end-to-end in one day with 6 CPUs and 1 GPU. In addition, our platform is flexible in terms of environment-agent communication topologies, choices of RL methods, and changes in game parameters, and it can host existing C/C++-based game environments such as the Arcade Learning Environment. Using ELF, we thoroughly explore training parameters and show that a network with Leaky ReLU and Batch Normalization, coupled with long-horizon training and a progressive curriculum, beats the rule-based built-in AI more than $70\%$ of the time in the full game of Mini-RTS. Strong performance is also achieved on the other two games. In game replays, we show that our agents learn interesting strategies. ELF, along with its RL platform, is open-sourced at this https URL.

Yuandong Tian | C. Lawrence Zitnick | Yuxin Wu | Wenling Shang | Qucheng Gong
