How to Motivate Your Dragon: Teaching Goal-Driven Agents to Speak and Act in Fantasy Worlds

We seek to create agents that both act and communicate with other agents in pursuit of a goal. Towards this end, we extend LIGHT (Urbanek et al. 2019)—a large-scale crowd-sourced fantasy text-game—with a dataset of quests. These contain natural language motivations paired with in-game goals and human demonstrations; completing a quest might require dialogue or actions (or both). We introduce a reinforcement learning system that (1) incorporates large-scale language modeling-based and commonsense reasoning-based pre-training to imbue the agent with relevant priors; and (2) leverages a factorized action space of action commands and dialogue, balancing between the two. We conduct zero-shot evaluations using held-out human expert demonstrations, showing that our agents are able to act consistently and talk naturally with respect to their motivations.

[1] Jonathan May, et al. Comprehensible Context-driven Text Game Playing, 2019, 2019 IEEE Conference on Games (CoG).

[2] Samuel R. Bowman, et al. Sentence Encoders on STILTs: Supplementary Training on Intermediate Labeled-data Tasks, 2018, ArXiv.

[3] Shie Mannor, et al. Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning, 2018, NeurIPS.

[4] Matthew J. Hausknecht, et al. TextWorld: A Learning Environment for Text-based Games, 2018, CGW@IJCAI.

[5] Yejin Choi, et al. ATOMIC: An Atlas of Machine Commonsense for If-Then Reasoning, 2019, AAAI.

[6] Minlie Huang, et al. A Knowledge-Enhanced Pretraining Model for Commonsense Story Generation, 2020, TACL.

[7] Hervé Frezza-Buet, et al. Sample-efficient batch reinforcement learning for dialogue management optimization, 2011, TSLP.

[8] R. A. Brooks, et al. Intelligence without Representation, 1991, Artif. Intell.

[9] Jing He, et al. Policy Networks with Two-Stage Training for Dialogue Systems, 2016, SIGDIAL Conference.

[10] Matthew J. Hausknecht, et al. How to Avoid Being Eaten by a Grue: Structured Exploration Strategies for Textual Worlds, 2020, ArXiv.

[11] Jason Weston, et al. Queens are Powerful too: Mitigating Gender Bias in Dialogue Generation, 2020, EMNLP.

[12] Mark O. Riedl, et al. Transfer in Deep Reinforcement Learning Using Knowledge Graphs, 2019, EMNLP.

[13] Mary Williamson, et al. Recipes for Building an Open-Domain Chatbot, 2020, EACL.

[14] Andreas Oikonomou, et al. A study of how different game play aspects can affect the popularity of role-playing video games, 2011, 2011 16th International Conference on Computer Games (CGAMES).

[15] Stefan Lee, et al. Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[16] Matthew Henderson, et al. The Second Dialog State Tracking Challenge, 2014, SIGDIAL Conference.

[17] Romain Laroche, et al. Learning Dynamic Knowledge Graphs to Generalize on Text-Based Games, 2020, ArXiv.

[18] Qi Wu, et al. Vision-and-Language Navigation: Interpreting Visually-Grounded Navigation Instructions in Real Environments, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[19] Jeremy Blackburn, et al. The Pushshift Reddit Dataset, 2020, ICWSM.

[20] Romain Laroche, et al. Counting to Explore and Generalize in Text-based Games, 2018, ArXiv.

[21] Catherine Havasi, et al. Representing General Relational Knowledge in ConceptNet 5, 2012, LREC.

[22] Stefanie Tellex, et al. Toward understanding natural language directions, 2010, 2010 5th ACM/IEEE International Conference on Human-Robot Interaction (HRI).

[23] Doug Downey, et al. Don’t Stop Pretraining: Adapt Language Models to Domains and Tasks, 2020, ACL.

[24] Antoine Bordes, et al. Training Millions of Personalized Dialogue Agents, 2018, EMNLP.

[25] Kyunghyun Cho, et al. Countering Language Drift via Visual Grounding, 2019, EMNLP.

[26] Marilyn A. Walker, et al. Reinforcement Learning for Spoken Dialogue Systems, 1999, NIPS.

[27] Mike Lewis, et al. Hierarchical Text Generation and Planning for Strategic Dialogue, 2017, ICML.

[28] Benjamin Kuipers, et al. Walk the Talk: Connecting Language, Knowledge, and Action in Route Instructions, 2006, AAAI.

[29] Romain Laroche, et al. Learning Dynamic Belief Graphs to Generalize on Text-Based Games, 2020, NeurIPS.

[30] Matthew J. Hausknecht, et al. Interactive Fiction Games: A Colossal Adventure, 2020, AAAI.

[31] Leonard Adolphs, et al. LeDeepChef: Deep Reinforcement Learning Agent for Families of Text-Based Games, 2019, AAAI.

[32] Hannes Schulz, et al. Frames: a corpus for adding memory to goal-oriented dialogue systems, 2017, SIGDIAL Conference.

[33] Jason Weston, et al. Learning to Speak and Act in a Fantasy Text Adventure Game, 2019, EMNLP.

[34] Matthew J. Hausknecht, et al. Graph Constrained Reinforcement Learning for Natural Language Action Spaces, 2020, ICLR.

[35] Thibault Sellam, et al. BLEURT: Learning Robust Metrics for Text Generation, 2020, ACL.

[36] Jason Weston, et al. I love your chain mail! Making knights smile in a fantasy game world: Open-domain goal-oriented dialogue agents, 2019, ArXiv.

[37] David Wingate, et al. What Can You Do with a Rock? Affordance Extraction via Word Embeddings, 2017, IJCAI.

[38] Yejin Choi, et al. COMET: Commonsense Transformers for Automatic Knowledge Graph Construction, 2019, ACL.

[39] L. Barsalou. Grounded cognition, 2008, Annual review of psychology.

[40] Ali Farhadi, et al. AI2-THOR: An Interactive 3D Environment for Visual AI, 2017, ArXiv.

[41] Jianfeng Gao, et al. Deep Reinforcement Learning for Dialogue Generation, 2016, EMNLP.

[42] J. Feldman, et al. Embodied meaning in a neural theory of language, 2004, Brain and Language.

[43] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.

[44] Shimon Whiteson, et al. A Survey of Reinforcement Learning Informed by Natural Language, 2019, IJCAI.

[45] Joshua B. Tenenbaum, et al. Building machines that learn and think like people, 2016, Behavioral and Brain Sciences.

[46] Ray Kurzweil, et al. Learning Semantic Textual Similarity from Conversations, 2018, Rep4NLP@ACL.

[47] Yann Dauphin, et al. Deal or No Deal? End-to-End Learning of Negotiation Dialogues, 2017, EMNLP.

[48] Zhendong Mao, et al. Knowledge Graph Embedding: A Survey of Approaches and Applications, 2017, IEEE Transactions on Knowledge and Data Engineering.

[49] Eduardo F. Morales, et al. An Introduction to Reinforcement Learning, 2011.

[50] Regina Barzilay, et al. Language Understanding for Text-based Games using Deep Reinforcement Learning, 2015, EMNLP.

[51] Mathias Niepert, et al. Attending to Future Tokens for Bidirectional Sequence Generation, 2019, EMNLP.

[52] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.

[53] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.

[54] Mrinmaya Sachan, et al. Enhancing Text-based Reinforcement Learning Agents with Commonsense Knowledge, 2020, ArXiv.

[55] Wojciech M. Czarnecki, et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning, 2019, Nature.

[56] Jianfeng Gao, et al. Deep Reinforcement Learning with a Natural Language Action Space, 2015, ACL.

[57] Tomas Mikolov, et al. A Roadmap Towards Machine Intelligence, 2015, CICLing.

[58] Jason Weston, et al. Poly-encoders: Architectures and Pre-training Strategies for Fast and Accurate Multi-sentence Scoring, 2020, ICLR.

[59] Daniel Marcu, et al. Natural Language Communication with Robots, 2016, NAACL.

[60] Igor Mordatch, et al. A Paradigm for Situated and Goal-Driven Language Learning, 2016, ArXiv.

[61] Alex Graves, et al. Automated Curriculum Learning for Neural Networks, 2017, ICML.

[62] Demis Hassabis, et al. Mastering Atari, Go, chess and shogi by planning with a learned model, 2019, Nature.