论文信息 - Grounding Language for Transfer in Deep Reinforcement Learning

Grounding Language for Transfer in Deep Reinforcement Learning

In this paper, we explore the utilization of natural language to drive transfer for reinforcement learning (RL). Despite the wide-spread application of deep RL techniques, learning generalized policy representations that work across domains remains a challenging problem. We demonstrate that textual descriptions of environments provide a compact intermediate channel to facilitate effective policy transfer. Specifically, by learning to ground the meaning of text to the dynamics of the environment such as transitions and rewards, an autonomous agent can effectively bootstrap policy learning on a new domain given its description. We employ a model-based RL approach consisting of a differentiable planning module, a model-free component and a factorized state representation to effectively use entity descriptions. Our model outperforms prior work on both transfer and multi-task scenarios in a variety of different environments. For instance, we achieve up to 14% and 11.5% absolute improvement over previously existing models in terms of average and initial rewards, respectively.

[1] Dileep George,et al. Schema Networks: Zero-shot Transfer with a Generative Causal Model of Intuitive Physics , 2017, ICML.

[2] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .

[3] Wojciech Zaremba,et al. Domain randomization for transferring deep neural networks from simulation to the real world , 2017, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[4] Tom Schaul,et al. Universal Value Function Approximators , 2015, ICML.

[5] Dan Klein,et al. Alignment-Based Compositional Semantics for Instruction Following , 2015, EMNLP.

[6] Noah A. Smith,et al. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) , 2016, ACL 2016.

[7] Julian Togelius,et al. General Video Game AI: Competition, Challenges and Opportunities , 2016, AAAI.

[8] Ruslan Salakhutdinov,et al. Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning , 2015, ICLR.

[9] Peter Stone,et al. Transfer Learning via Inter-Task Mappings for Temporal Difference Learning , 2007, J. Mach. Learn. Res..

[10] Michael D. Buhrmester,et al. Amazon's Mechanical Turk , 2011, Perspectives on psychological science : a journal of the Association for Psychological Science.

[11] Matthew E. Taylor,et al. Initial Progress in Transfer for Deep Reinforcement Learning Algorithms , 2016 .

[12] Luke S. Zettlemoyer,et al. Weakly Supervised Learning of Semantic Parsers for Mapping Instructions to Actions , 2013, TACL.

[13] Philip H. S. Torr,et al. An embarrassingly simple approach to zero-shot learning , 2015, ICML.

[14] Regina Barzilay,et al. Learning to Win by Reading Manuals in a Monte-Carlo Framework , 2011, ACL.

[15] Sinno Jialin Pan,et al. Knowledge Transfer for Deep Reinforcement Learning with Hierarchical Experience Replay , 2017, AAAI.

[16] Jeffrey Dean,et al. Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.

[17] Luke S. Zettlemoyer,et al. Learning to Parse Natural Language Commands to a Robot Control System , 2012, ISER.

[18] Matthew R. Walter,et al. Learning spatial-semantic representations from natural language descriptions and scene classifications , 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).

[19] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.

[20] Pieter Abbeel,et al. Value Iteration Networks , 2016, NIPS.

[21] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.

[22] Peter Stone,et al. Transferring Instances for Model-Based Reinforcement Learning , 2008, ECML/PKDD.