Paper Information - Relational Deep Reinforcement Learning - 字舞流文

Relational Deep Reinforcement Learning

We introduce an approach for deep reinforcement learning (RL) that improves upon the efficiency, generalization capacity, and interpretability of conventional approaches through structured perception and relational reasoning. It uses self-attention to iteratively reason about the relations between entities in a scene and to guide a model-free policy. Our results show that in a novel navigation and planning task called Box-World, our agent finds interpretable solutions that improve upon baselines in terms of sample complexity, ability to generalize to more complex scenes than experienced during training, and overall performance. In the StarCraft II Learning Environment, our agent achieves state-of-the-art performance on six mini-games -- surpassing human grandmaster performance on four. By considering architectural inductive biases, our work opens new directions for overcoming important, but stubborn, challenges in deep RL.
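The abstract's central mechanism, self-attention applied repeatedly over the entities in a scene, can be illustrated with a small sketch. This is a single-head, pure-Python toy under stated assumptions, not the authors' architecture: the function names, the random projection matrices `w_q`, `w_k`, `w_v`, the entity dimension, and the choice to reuse the same weights on every pass are all hypothetical and chosen only for illustration. It shows each entity scoring every other entity and mixing their value vectors by the resulting weights.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a (rows x cols) matrix by a length-cols vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def self_attention(entities, w_q, w_k, w_v):
    """One pass of single-head scaled dot-product self-attention.

    entities: list of N feature vectors, one per entity in the scene.
    w_q, w_k, w_v: hypothetical projection matrices (illustrative only).
    Returns (updated_entities, attention_weights).
    """
    queries = [matvec(w_q, e) for e in entities]
    keys = [matvec(w_k, e) for e in entities]
    values = [matvec(w_v, e) for e in entities]
    scale = math.sqrt(len(keys[0]))

    updated, weights = [], []
    for q in queries:
        # Score this entity's query against every entity's key.
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        attn = softmax(scores)
        # Mix the value vectors according to the attention weights.
        mixed = [sum(a * v[i] for a, v in zip(attn, values))
                 for i in range(len(values[0]))]
        updated.append(mixed)
        weights.append(attn)
    return updated, weights


def random_matrix(rows, cols, rng):
    """Build a rows x cols matrix of uniform random values (assumption)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
dim = 4
entities = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(3)]
w_q, w_k, w_v = (random_matrix(dim, dim, rng) for _ in range(3))

# Iterative reasoning, sketched as two passes that reuse the same weights.
for _ in range(2):
    entities, attention = self_attention(entities, w_q, w_k, w_v)
```

Each row of `attention` is a probability distribution over entities, which is the kind of quantity one could inspect when looking for interpretable relations. A policy network would consume the updated entity vectors, a step this sketch leaves out.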

Razvan Pascanu | Murray Shanahan | Yujia Li | Oriol Vinyals | Karl Tuyls | Matthew Botvinick | Victor Bapst | Timothy P. Lillicrap | Adam Santoro | David P. Reichert | Peter W. Battaglia | David Raposo | Victoria Langston | Edward Lockhart | Vinícius Flores Zambaldi | Igor Babuschkin

[1] Luc De Raedt, et al. Inductive Logic Programming: Theory and Methods, 1994, J. Log. Program.

[2] Kurt Driessens, et al. Relational Instance Based Regression for Relational Reinforcement Learning, 2003, ICML.

[3] Saso Dzeroski, et al. Integrating Guidance into Relational Reinforcement Learning, 2004, Machine Learning.

[4] Ah Chung Tsoi, et al. The Graph Neural Network Model, 2009, IEEE Transactions on Neural Networks.

[5] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.

[6] Murray Shanahan, et al. Towards Deep Symbolic Reinforcement Learning, 2016, ArXiv.

[7] Joshua B. Tenenbaum, et al. Building machines that learn and think like people, 2016, Behavioral and Brain Sciences.

[8] Mathias Niepert, et al. Learning Convolutional Neural Networks for Graphs, 2016, ICML.

[9] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.

[10] Razvan Pascanu, et al. Interaction Networks for Learning about Objects, Relations and Physics, 2016, NIPS.

[11] Razvan Pascanu, et al. A simple neural network module for relational reasoning, 2017, NIPS.

[12] Tom Schaul, et al. FeUdal Networks for Hierarchical Reinforcement Learning, 2017, ICML.

[13] Razvan Pascanu, et al. Metacontrol for Adaptive Imagination-Based Optimization, 2017, ICLR.

[14] Razvan Pascanu, et al. Discovering objects and their relations from entangled scene representations, 2017, ICLR.

[15] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.

[16] Max Welling, et al. Semi-Supervised Classification with Graph Convolutional Networks, 2016, ICLR.

[17] Razvan Pascanu, et al. Visual Interaction Networks: Learning a Physics Simulator from Video, 2017, NIPS.

[18] Razvan Pascanu, et al. Sim-to-Real Robot Learning from Pixels with Progressive Nets, 2016, CoRL.

[19] Razvan Pascanu, et al. Imagination-Augmented Agents for Deep Reinforcement Learning, 2017, NIPS.

[20] Tom Schaul, et al. StarCraft II: A New Challenge for Reinforcement Learning, 2017, ArXiv.

[21] Dileep George, et al. Schema Networks: Zero-shot Transfer with a Generative Causal Model of Intuitive Physics, 2017, ICML.

[22] Max Jaderberg, et al. Population Based Training of Neural Networks, 2017, ArXiv.

[23] Razvan Pascanu, et al. Learning model-based planning from scratch, 2017, ArXiv.

[24] Samy Bengio, et al. Neural Combinatorial Optimization with Reinforcement Learning, 2016, ICLR.

[25] Le Song, et al. 2 Common Formulation for Greedy Algorithms on Graphs, 2018.

[26] Max Welling, et al. Attention Solves Your TSP, 2018, ArXiv.

[27] Samy Bengio, et al. A Study on Overfitting in Deep Reinforcement Learning, 2018, ArXiv.

[28] Xinlei Chen, et al. Iterative Visual Reasoning Beyond Convolutions, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[29] Yichen Wei, et al. Relation Networks for Object Detection, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[30] Pieter Abbeel, et al. A Simple Neural Attentive Meta-Learner, 2017, ICLR.

[31] Shane Legg, et al. IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures, 2018, ICML.

[32] Rémi Munos, et al. Learning to Search with MCTSnets, 2018, ICML.

[33] Abhinav Gupta, et al. Non-local Neural Networks, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[34] De, et al. Relational Reinforcement Learning, 2022.