Obstacle Tower: A Generalization Challenge in Vision, Control, and Planning

The rapid pace of recent research in AI has been driven in part by the availability of fast and challenging simulation environments. These environments often take the form of games, with tasks ranging from simple board games to competitive video games. We propose a new benchmark, Obstacle Tower: a high-fidelity, 3D, third-person, procedurally generated environment. An agent playing Obstacle Tower must learn to solve both low-level control and high-level planning problems in tandem while learning from pixels and a sparse reward signal. Unlike other benchmarks such as the Arcade Learning Environment, evaluation of agent performance in Obstacle Tower is based on the agent's ability to perform well on unseen instances of the environment. In this paper we outline the environment and provide a set of baseline results produced by current state-of-the-art deep RL methods as well as human players. These algorithms fail to produce agents that perform anywhere near human level.
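
To make the evaluation protocol concrete, the sketch below separates the procedural-generation seeds an agent may train on from a held-out set used only for scoring, and reports average return on the held-out instances. It assumes a Gym-style interface; the ObstacleTowerEnv import path, the retro flag, and the seed/reset behavior shown here are illustrative assumptions, not the paper's published API.

    # Minimal sketch of evaluation on unseen procedurally generated instances.
    # ObstacleTowerEnv, its constructor argument, and the seeding calls are
    # assumptions for illustration; consult the released environment for the
    # actual interface.
    from obstacle_tower_env import ObstacleTowerEnv  # assumed import path

    TRAIN_SEEDS = range(100)       # instances available during training
    EVAL_SEEDS = range(100, 105)   # held-out instances used for evaluation

    def evaluate(policy, seeds, episodes_per_seed=5):
        """Average episodic return on environment instances the agent never saw."""
        env = ObstacleTowerEnv(retro=False)  # assumed constructor argument
        returns = []
        for seed in seeds:
            for _ in range(episodes_per_seed):
                env.seed(seed)               # fix the procedural-generation seed
                obs = env.reset()
                done, total = False, 0.0
                while not done:
                    action = policy(obs)     # policy maps raw pixels to an action
                    obs, reward, done, _ = env.step(action)
                    total += reward          # sparse reward: floor completions
                returns.append(total)
        env.close()
        return sum(returns) / len(returns)

Because the benchmark scores agents only on seeds outside the training set, a policy that memorizes specific tower layouts gains nothing at evaluation time; it must acquire vision, control, and planning skills that transfer across instances.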
