ELF: An Extensive, Lightweight and Flexible Research Platform for Real-time Strategy Games

In this paper, we propose ELF, an Extensive, Lightweight and Flexible platform for fundamental reinforcement learning research. Using ELF, we implement a highly customizable real-time strategy (RTS) engine with three game environments (Mini-RTS, Capture the Flag and Tower Defense). Mini-RTS, as a miniature version of StarCraft, captures key game dynamics and runs at 40K frame-per-second (FPS) per core on a Macbook Pro notebook. When coupled with modern reinforcement learning methods, the system can train a full-game bot against built-in AIs end-to-end in one day with 6 CPUs and 1 GPU. In addition, our platform is flexible in terms of environment-agent communication topologies, choices of RL methods, changes in game parameters, and can host existing C/C++-based game environments like Arcade Learning Environment. Using ELF, we thoroughly explore training parameters and show that a network with Leaky ReLU and Batch Normalization coupled with long-horizon training and progressive curriculum beats the rule-based built-in AI more than $70\%$ of the time in the full game of Mini-RTS. Strong performance is also achieved on the other two games. In game replays, we show our agents learn interesting strategies. ELF, along with its RL platform, is open-sourced at this https URL.

[1]  Yishay Mansour,et al.  Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.

[2]  ON THE DEVELOPMENT OF A FREE RTS GAME ENGINE , 2005 .

[3]  H. Jaap van den Herik,et al.  Parallel Monte-Carlo Tree Search , 2008, Computers and Games.

[4]  Simon M. Lucas,et al.  A Survey of Monte Carlo Tree Search Methods , 2012, IEEE Transactions on Computational Intelligence and AI in Games.

[5]  Andrew L. Maas Rectifier Nonlinearities Improve Neural Network Acoustic Models , 2013 .

[6]  Santiago Ontañón,et al.  The Combinatorial Multi-Armed Bandit Problem and Its Application to Real-Time Strategy Games , 2013, AIIDE.

[7]  Sergey Ioffe,et al.  Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.

[8]  Sergey Levine,et al.  Trust Region Policy Optimization , 2015, ICML.

[9]  Rob Fergus,et al.  MazeBase: A Sandbox for Learning from Games , 2015, ArXiv.

[10]  Shane Legg,et al.  Massively Parallel Methods for Deep Reinforcement Learning , 2015, ArXiv.

[11]  Shane Legg,et al.  Human-level control through deep reinforcement learning , 2015, Nature.

[12]  Marc G. Bellemare,et al.  The Arcade Learning Environment: An Evaluation Platform for General Agents , 2012, J. Artif. Intell. Res..

[13]  Yuandong Tian,et al.  Better Computer Go Player with Neural Network and Long-term Prediction , 2016, ICLR.

[14]  Alex Graves,et al.  Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.

[15]  Demis Hassabis,et al.  Mastering the game of Go with deep neural networks and tree search , 2016, Nature.

[16]  Wojciech Jaskowski,et al.  ViZDoom: A Doom-based AI research platform for visual reinforcement learning , 2016, 2016 IEEE Conference on Computational Intelligence and Games (CIG).

[17]  Katja Hofmann,et al.  The Malmo Platform for Artificial Intelligence Experimentation , 2016, IJCAI.

[18]  Florian Richoux,et al.  TorchCraft: a Library for Machine Learning Research on Real-Time Strategy Games , 2016, ArXiv.

[19]  Nicolas Usunier,et al.  Episodic Exploration for Deep Deterministic Policies: An Application to StarCraft Micromanagement Tasks , 2016, ArXiv.

[20]  Stephen Tyree,et al.  Reinforcement Learning through Asynchronous Advantage Actor-Critic on a GPU , 2016, ICLR.

[21]  Jun Wang,et al.  Multiagent Bidirectionally-Coordinated Nets for Learning to Play StarCraft Combat Games , 2017, ArXiv.

[22]  Yuandong Tian,et al.  Training Agent for First-Person Shooter Game with Actor-Critic Curriculum Learning , 2016, ICLR.

[23]  Razvan Pascanu,et al.  Learning to Navigate in Complex Environments , 2016, ICLR.

[24]  Shai Rozenberg,et al.  Playing SNES in the Retro Learning Environment , 2016, ICLR.

[25]  Guillaume Lample,et al.  Playing FPS Games with Deep Reinforcement Learning , 2016, AAAI.