Paper Information - Navigational Instruction Generation as Inverse Reinforcement Learning with Neural Machine Translation

Modern robotics applications that involve human-robot interaction require robots to be able to communicate with humans seamlessly and effectively. Natural language provides a flexible and efficient medium through which robots can exchange information with their human partners. Significant advancements have been made in developing robots capable of interpreting free-form instructions, but less attention has been devoted to endowing robots with the ability to generate natural language. We propose a model that enables robots to generate natural language instructions that allow humans to navigate a priori unknown environments. We first decide which information to share with the user according to their preferences, using a policy trained from human demonstrations via inverse reinforcement learning. We then “translate” this information into a natural language instruction using a neural sequence-to-sequence model that learns to generate free-form instructions from natural language corpora. We evaluate our method on a benchmark route instruction dataset and achieve a BLEU score of 72.18% compared to human-generated reference instructions. We additionally conduct navigation experiments with human participants demonstrating that our method generates instructions that people follow as accurately and easily as those produced by humans.
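The BLEU comparison against human-written reference instructions, mentioned in the abstract, can be illustrated with a minimal sketch. The abstract does not say which BLEU variant the authors use (n-gram order, smoothing, corpus-level versus sentence-level), so the code below assumes a single reference, n-grams up to order 4, and no smoothing. The function names `ngrams` and `sentence_bleu` and the example sentences are hypothetical and not taken from the paper.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU against one reference, without smoothing.

    Assumption: this is an illustrative variant, not necessarily the exact
    scoring setup used in the paper.
    """
    if not candidate:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        ref_counts = ngrams(reference, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            # Any zero precision makes the geometric mean zero.
            return 0.0
        log_precision_sum += math.log(overlap / total)
    # The brevity penalty discourages candidates shorter than the reference.
    if len(candidate) > len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1.0 - len(reference) / len(candidate))
    return brevity_penalty * math.exp(log_precision_sum / max_n)


# Hypothetical route instructions, tokenized on whitespace.
reference = "turn left at the easel".split()
generated = "turn left at the chair".split()
print(sentence_bleu(reference, reference))  # identical sentences score 1.0
print(sentence_bleu(generated, reference))  # partial overlap scores below 1.0
```

Multiplying the returned value by 100 gives a percentage comparable in form to the figure quoted in the abstract. Scores from this unsmoothed, single-reference sketch are not directly comparable to the paper's reported number.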

Matthew R. Walter | Mohit Bansal | Andrea F. Daniele
