Navigational Instruction Generation as Inverse Reinforcement Learning with Neural Machine Translation

Modern robotics applications that involve human-robot interaction require robots to communicate with humans seamlessly and effectively. Natural language provides a flexible and efficient medium through which robots can exchange information with their human partners. Significant advancements have been made in developing robots capable of interpreting free-form instructions, but less attention has been devoted to endowing robots with the ability to generate natural language. We propose a model that enables robots to generate natural language instructions that allow humans to navigate a priori unknown environments. We first decide which information to share with the user according to their preferences, using a policy trained from human demonstrations via inverse reinforcement learning. We then “translate” this information into a natural language instruction using a neural sequence-to-sequence model that learns to generate free-form instructions from natural language corpora. We evaluate our method on a benchmark route instruction dataset and achieve a BLEU score of 72.18% against human-generated reference instructions. We additionally conduct navigation experiments with human participants, demonstrating that our method generates instructions that people follow as accurately and easily as those produced by humans.
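As a rough illustration of the BLEU metric reported above, the following is a minimal sketch of corpus-level BLEU with clipped n-gram precision up to 4-grams and a brevity penalty. It assumes whitespace-tokenized text, a single reference per candidate, uniform n-gram weights, and no smoothing; the evaluation setup actually used for the reported score is not stated here, and the function names and example sentences are hypothetical.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of order n in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(candidates, references, max_n=4):
    """Corpus-level BLEU, assuming one reference per candidate (hypothetical helper)."""
    matched = [0] * max_n  # clipped n-gram matches, per order
    total = [0] * max_n    # candidate n-grams, per order
    cand_len = ref_len = 0
    for cand, ref in zip(candidates, references):
        cand_len += len(cand)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            cand_ngrams = ngram_counts(cand, n)
            ref_ngrams = ngram_counts(ref, n)
            # Clip each candidate n-gram count by its count in the reference.
            matched[n - 1] += sum(min(c, ref_ngrams[g]) for g, c in cand_ngrams.items())
            total[n - 1] += sum(cand_ngrams.values())
    if cand_len == 0 or min(matched) == 0:
        return 0.0  # without smoothing, any order with no matches zeroes the score
    # Geometric mean of the modified precisions (uniform weights).
    log_precision = sum(math.log(m / t) for m, t in zip(matched, total)) / max_n
    # Penalize candidates that are shorter than their references.
    brevity_penalty = 1.0 if cand_len > ref_len else math.exp(1 - ref_len / cand_len)
    return brevity_penalty * math.exp(log_precision)


# Hypothetical example: one generated instruction scored against one human reference.
generated = ["go forward to the chair and turn left".split()]
human = ["go forward to the chair then turn left".split()]
score = corpus_bleu(generated, human)
print(f"BLEU = {100 * score:.2f}%")
```

BLEU is conventionally reported as a percentage, as in the abstract, so the raw score in [0, 1] is multiplied by 100 for display.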
