Scaling Imitation Learning in Minecraft

Imitation learning is a powerful family of techniques for learning sensorimotor coordination in immersive environments. We apply imitation learning to attain state-of-the-art performance on hard exploration problems in the Minecraft environment. We report experiments that highlight the influence of network architecture, loss function, and data augmentation. An early version of our approach reached second place in the MineRL competition at NeurIPS 2019. Here we report stronger results that can be used as a starting point for future competition entries and related research. Our code is available online.
