Flatland: a Lightweight First-Person 2-D Environment for Reinforcement Learning

Flatland is a simple, lightweight environment for fast prototyping and testing of reinforcement learning agents. It is of lower complexity compared to similar 3D platforms (e.g. DeepMind Lab or ViZDoom), but emulates physical properties of the real world, such as continuity, multi-modal partially-observable states with first-person view and coherent physics. We propose to use it as an intermediary benchmark for problems related to Lifelong Learning. Flatland is highly customizable and offers a wide range of task difficulty to extensively evaluate the properties of artificial agents. We experiment with three reinforcement learning baseline agents and show that they can rapidly solve a navigation task in Flatland. A video of an agent acting in Flatland is available here: https://youtu.be/I5y6Y2ZypdA.
