Scaling Tangled Program Graphs to Visual Reinforcement Learning in ViZDoom

The tangled program graph (TPG) framework was recently proposed as an emergent process for decomposing tasks and simultaneously composing solutions by organizing code into graphs of teams of programs. The initial evaluation assessed the ability of TPG to discover agents capable of playing Atari game titles under the Arcade Learning Environment. This is an example of ‘visual’ reinforcement learning, i.e. agents are evolved directly from the frame buffer without recourse to hand-designed features. TPG was able to evolve solutions competitive with state-of-the-art deep reinforcement learning solutions, but at a fraction of the complexity. One simplifying assumption was that the visual input could be downsampled from a \(210 \times 160\) resolution to \(42 \times 32\). In this work, we consider the challenging 3D first-person shooter environment of ViZDoom and require that agents be evolved at the original visual resolution of \(320 \times 240\) pixels. In addition, we address issues in developing agents capable of operating in multiple ViZDoom task scenarios simultaneously. The resulting TPG solutions retain the emergent properties and computational efficiency of the original work. Moreover, the solutions appear to generalize across multiple task scenarios, whereas equivalent deep reinforcement learning solutions have focused on single task scenarios alone.
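For concreteness, the following is a minimal sketch of how an agent might interact with a ViZDoom scenario at the full \(320 \times 240\) resolution via the ViZDoom Python API. The scenario config path, the three-button action set, and the random action policy are placeholder assumptions for illustration; the agents in this work are evolved TPG graphs, which are not shown here.

    # Minimal sketch: stepping a ViZDoom scenario at full 320x240 resolution.
    # The config path and the random policy below are placeholders; the
    # actual agents described above are evolved TPG graphs.
    import random
    from vizdoom import DoomGame, ScreenFormat, ScreenResolution

    game = DoomGame()
    game.load_config("scenarios/basic.cfg")   # hypothetical scenario file
    game.set_screen_resolution(ScreenResolution.RES_320X240)
    game.set_screen_format(ScreenFormat.GRAY8)  # single-channel frame buffer
    game.init()

    # Hypothetical one-hot encoding over three buttons defined by the config.
    actions = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

    game.new_episode()
    while not game.is_episode_finished():
        state = game.get_state()
        frame = state.screen_buffer        # raw 240x320 pixels for the agent
        reward = game.make_action(random.choice(actions))
    game.close()

By way of scale, the \(42 \times 32\) Atari input noted above corresponds to a factor-of-5 decimation of the \(210 \times 160\) frame in each dimension (e.g. frame[::5, ::5] in NumPy), whereas the full \(320 \times 240\) ViZDoom frame exposes the agent to over 50 times as many raw pixels.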
