Rotation, Translation, and Cropping for Zero-Shot Generalization

Deep Reinforcement Learning (DRL) has shown impressive performance on domains with visual inputs, in particular various games. However, the agent is usually trained on a fixed environment, e.g. a fixed number of levels. A growing mass of evidence suggests that these trained models fail to generalize to even slight variations of the environments they were trained on. This paper advances the hypothesis that the lack of generalization is partly due to the input representation, and explores how rotation, cropping and translation could increase generality. We show that a cropped, translated and rotated observation can get better generalization on unseen levels of two-dimensional arcade games from the GVGAI framework. The generality of the agents is evaluated on both human-designed and procedurally generated levels.

[1]  R. Fergus,et al.  Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels , 2020, ICLR.

[2]  P. Abbeel,et al.  Reinforcement Learning with Augmented Data , 2020, NeurIPS.

[3]  Pieter Abbeel,et al.  CURL: Contrastive Unsupervised Representations for Reinforcement Learning , 2020, ICML.

[4]  Xingyou Song,et al.  Observational Overfitting in Reinforcement Learning , 2019, ICLR.

[5]  J. Schulman,et al.  Leveraging Procedural Generation to Benchmark Reinforcement Learning , 2019, ICML.

[6]  Jinwoo Shin,et al.  Network Randomization: A Simple Technique for Generalization in Deep Reinforcement Learning , 2019, ICLR.

[7]  Igor Mordatch,et al.  Emergent Tool Use From Multi-Agent Autocurricula , 2019, ICLR.

[8]  Julian Togelius,et al.  Deep Learning for Video Game Playing , 2017, IEEE Transactions on Games.

[9]  Sam Devlin,et al.  MazeExplorer: A Customisable 3D Benchmark for Assessing Generalisation in Reinforcement Learning , 2019, 2019 IEEE Conference on Games (CoG).

[10]  Wojciech M. Czarnecki,et al.  Grandmaster level in StarCraft II using multi-agent reinforcement learning , 2019, Nature.

[11]  Sam Devlin,et al.  Generalization in Reinforcement Learning with Selective Noise Injection and Information Bottleneck , 2019, NeurIPS.

[12]  Julian Togelius,et al.  Obstacle Tower: A Generalization Challenge in Vision, Control, and Planning , 2019, IJCAI.

[13]  Taehoon Kim,et al.  Quantifying Generalization in Reinforcement Learning , 2018, ICML.

[14]  Marek Wydmuch,et al.  ViZDoom Competitions: Playing Doom From Pixels , 2018, IEEE Transactions on Games.

[15]  Julian Togelius,et al.  Playing Atari with Six Neurons , 2018, AAMAS.

[16]  Joelle Pineau,et al.  Natural Environment Benchmarks for Reinforcement Learning , 2018, ArXiv.

[17]  Jürgen Schmidhuber,et al.  Recurrent World Models Facilitate Policy Evolution , 2018, NeurIPS.

[18]  Julian Togelius,et al.  Illuminating Generalization in Deep Reinforcement Learning through Procedural Level Generation , 2018, 1806.10729.

[19]  Julian Togelius,et al.  Deep Reinforcement Learning for General Video Game AI , 2018, 2018 IEEE Conference on Computational Intelligence and Games (CIG).

[20]  Shane Legg,et al.  IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures , 2018, ICML.

[21]  Abhinav Gupta,et al.  Non-local Neural Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[22]  Marlos C. Machado,et al.  Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents , 2017, J. Artif. Intell. Res..

[23]  Lukasz Kaiser,et al.  Attention is All you Need , 2017, NIPS.

[24]  Wojciech Zaremba,et al.  OpenAI Gym , 2016, ArXiv.

[25]  Alex Graves,et al.  Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.

[26]  Shane Legg,et al.  Human-level control through deep reinforcement learning , 2015, Nature.

[27]  Servicio Geológico Colombiano Sgc Volume 4 , 2013, Journal of Diabetes Investigation.

[28]  M. Preuss,et al.  Computational intelligence and games , 2013, GECCO.

[29]  P. Cochat,et al.  Et al , 2008, Archives de pediatrie : organe officiel de la Societe francaise de pediatrie.

[30]  Dieter Fox,et al.  Reinforcement learning for sensing strategies , 2004, 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (IEEE Cat. No.04CH37566).

[31]  Peter Dayan,et al.  Q-learning , 1992, Machine Learning.

[32]  Michael Wooldridge,et al.  Autonomous agents and multi-agent systems , 2014 .

[33]  Thomas G. Dietterich What is machine learning? , 2020, Archives of Disease in Childhood.

[34]  D. Broadbent Perception and communication , 1958 .