High-Level Strategy Selection under Partial Observability in StarCraft: Brood War

We consider the problem of high-level strategy selection in the adversarial setting of real-time strategy games from a reinforcement learning perspective, where taking an action corresponds to switching to the respective strategy. Here, a good strategy successfully counters the opponent's current and possible future strategies which can only be estimated using partial observations. We investigate whether we can utilize the full game state information during training time (in the form of an auxiliary prediction task) to increase performance. Experiments carried out within a StarCraft: Brood War bot against strong community bots show substantial win rate improvements over a fixed-strategy baseline and encouraging results when learning with the auxiliary task.

[1]  Tom Schaul,et al.  StarCraft II: A New Challenge for Reinforcement Learning , 2017, ArXiv.

[2]  Nicolas Usunier,et al.  Forward Modeling for Partial Observation Strategy Games - A StarCraft Defogger , 2018, NeurIPS.

[3]  Santiago Ontañón,et al.  A Survey of Real-Time Strategy Game AI Research and Competition in StarCraft , 2013, IEEE Transactions on Computational Intelligence and AI in Games.

[4]  Tong Zhang,et al.  A Framework for Learning Predictive Structures from Multiple Tasks and Unlabeled Data , 2005, J. Mach. Learn. Res..

[5]  Sebastian Risi,et al.  Learning macromanagement in starcraft from replays using deep learning , 2017, 2017 IEEE Conference on Computational Intelligence and Games (CIG).

[6]  Guillaume Lample,et al.  Playing FPS Games with Deep Reinforcement Learning , 2016, AAAI.

[7]  Kibok Lee,et al.  Augmenting Supervised Neural Networks with Unsupervised Objectives for Large-scale Image Classification , 2016, ICML.

[8]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[9]  Razvan Pascanu,et al.  Learning to Navigate in Complex Environments , 2016, ICLR.

[10]  Vladlen Koltun,et al.  Learning to Act by Predicting the Future , 2016, ICLR.

[11]  Gabriel Synnaeve,et al.  Programmation et apprentissage bayésien pour les jeux vidéo multi-joueurs, application à l'intelligence artificielle de jeux de stratégies temps-réel. (Bayesian Programming and Learning for Multi-Player Video Games, Application to RTS AI) , 2012 .