Learning Task Knowledge from Dialog and Web Access

We present KnoWDiaL, an approach for Learning and using task-relevant Knowledge from human-robot Dialog and access to the Web. KnoWDiaL assumes an autonomous agent that performs tasks requested by humans through speech. The agent needs to “understand” the request, i.e., to ground the task fully enough that it can plan for and execute it. KnoWDiaL contributes such understanding by using and updating a Knowledge Base, by dialoguing with the user, and by accessing the Web. We believe that KnoWDiaL, as we present it, can be applied to general autonomous agents. However, we focus on our work with our autonomous collaborative robot, CoBot, which executes service tasks in a building, moving around and transporting objects between locations. Hence, the knowledge acquired and accessed consists of groundings of language to robot actions and to building locations, persons, and objects. KnoWDiaL handles the interpretation of voice commands, is robust to speech recognition errors, and is able to learn commands involving referring expressions in an open domain, i.e., without requiring a lexicon. We present in detail the components of KnoWDiaL, namely a frame-semantic parser, a probabilistic grounding model, a web-based predicate evaluator, a dialog manager, and the weighted predicate-based Knowledge Base. We illustrate the knowledge access and updates that result from dialog and Web access through detailed and complete examples. We further evaluate the correctness of the predicate instances learned into the Knowledge Base, and show the increase in dialog efficiency as a function of the number of interactions. We have extensively and successfully used KnoWDiaL on CoBot, dialoguing and accessing the Web, and we extract a few corresponding example sequences from captured videos.
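To make the idea of a weighted predicate-based Knowledge Base more concrete, the following is a minimal illustrative sketch, not the KnoWDiaL implementation. The class name `WeightedKB`, the predicate name `locationGroundsTo`, the room identifiers, and the choice to score candidate groundings by normalized weight are all assumptions made for this example; the abstract does not specify them.

```python
from collections import defaultdict


class WeightedKB:
    """Toy store of weighted predicate instances (hypothetical design)."""

    def __init__(self):
        # Maps (predicate, phrase, grounding) -> accumulated weight.
        self._weights = defaultdict(float)

    def update(self, predicate, phrase, grounding, amount=1.0):
        """Add evidence, e.g. after a user confirms a grounding in dialog."""
        self._weights[(predicate, phrase, grounding)] += amount

    def rank(self, predicate, phrase):
        """Return candidate groundings with normalized weights, best first.

        Normalizing by the total weight is an assumption of this sketch,
        standing in for a probabilistic grounding model.
        """
        matches = {
            g: w
            for (p, e, g), w in self._weights.items()
            if p == predicate and e == phrase
        }
        total = sum(matches.values())
        if total == 0:
            return []  # Unknown phrase: a dialog or Web query would be needed.
        return sorted(
            ((g, w / total) for g, w in matches.items()),
            key=lambda item: -item[1],
        )


# Hypothetical usage: "the kitchen" was confirmed as room_A three times
# and as room_B once.
kb = WeightedKB()
kb.update("locationGroundsTo", "the kitchen", "room_A", 3.0)
kb.update("locationGroundsTo", "the kitchen", "room_B", 1.0)
ranked = kb.rank("locationGroundsTo", "the kitchen")
```

In this sketch an empty ranking marks a phrase the agent cannot yet ground, which is the point at which asking the user or querying the Web would plausibly take over.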
