Active Learning for Generating Motion and Utterances in Object Manipulation Dialogue Tasks

In an object manipulation dialogue, a robot may misunderstand an ambiguous command from a user, such as "Place the cup down (on the table)," potentially resulting in an accident. Although asking a confirmation question before every motion execution would reduce the risk of such failures, users will find the dialogue more convenient if confirmation questions are not asked in trivial situations. This paper proposes a method for estimating the ambiguity of commands by introducing an active learning framework with Bayesian logistic regression into human-robot spoken dialogue. We conducted physical experiments in which a user and a manipulator-based robot communicated in spoken language to manipulate objects.
