Using Unity to Help Solve Intelligence

In the pursuit of artificial general intelligence, the most significant measure of progress is an agent's ability to achieve goals in a wide range of environments. Existing platforms for constructing such environments are typically constrained by the technologies they are built on, and can therefore provide only a subset of the scenarios necessary to evaluate progress. To overcome these shortcomings, we present our use of Unity, a widely recognized and comprehensive game engine, to create more diverse and complex virtual simulations. We describe the concepts and components we developed to simplify the authoring of these environments, which are intended predominantly for use in reinforcement learning. We also introduce a practical approach to packaging and re-distributing environments that aims to improve the robustness and reproducibility of experimental results. To illustrate the versatility of our approach compared to other solutions, we highlight environments from published papers that were created with it. We hope that others draw inspiration from how we adapted Unity to our needs, and we anticipate that increasingly varied and complex environments will emerge from our approach as familiarity with it grows.
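
At their core, environments built this way expose the standard reinforcement-learning interaction loop: an agent resets the simulation, then repeatedly observes, acts, and receives a reward until the episode ends. The sketch below illustrates that loop with a self-contained toy; the `UnityEnvironment` class, its `reset`/`step` methods, and the `TimeStep` record are hypothetical stand-ins for whatever wrapper a real Unity build would expose, not the paper's actual API.

```python
import random
from dataclasses import dataclass

@dataclass
class TimeStep:
    observation: list   # e.g. pixels rendered by the engine
    reward: float
    done: bool          # True once the episode has ended

class UnityEnvironment:
    """Toy stand-in for an environment served by a Unity process."""

    def __init__(self, seed: int = 0):
        self._rng = random.Random(seed)
        self._steps_left = 0

    def reset(self) -> TimeStep:
        self._steps_left = 10
        return TimeStep(observation=[0.0], reward=0.0, done=False)

    def step(self, action: int) -> TimeStep:
        self._steps_left -= 1
        reward = self._rng.random()  # placeholder reward signal
        return TimeStep(observation=[float(action)], reward=reward,
                        done=self._steps_left <= 0)

# A random agent interacting for one episode.
env = UnityEnvironment(seed=42)
timestep = env.reset()
total_reward = 0.0
while not timestep.done:
    action = random.randrange(4)   # pick one of four discrete actions
    timestep = env.step(action)
    total_reward += timestep.reward
print(f"episode return: {total_reward:.2f}")
```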

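On the packaging and re-distribution side, one concrete way to pin an environment build is to ship it as a container image and launch a fresh instance per experiment. The snippet below is a minimal sketch using the docker-py SDK against a local Docker daemon; the image name, tag, and port are illustrative assumptions, not artifacts published with the paper.

```python
import docker  # docker-py SDK; assumes a local Docker daemon is running

client = docker.from_env()
container = client.containers.run(
    "example/unity-env:1.0.0",   # hypothetical pinned environment image
    detach=True,
    ports={"9000/tcp": 9000},    # expose the environment's RPC port
)
try:
    # ... connect an agent to localhost:9000 and run the experiment ...
    pass
finally:
    container.stop()
    container.remove()
```

Pinning the image tag means every run of an experiment talks to a byte-identical environment build, which is the reproducibility property the abstract alludes to.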