Two people walk into a bar: dynamic multi-party social interaction with a robot agent

We introduce a humanoid robot bartender capable of dealing with multiple customers in a dynamic, multi-party social setting. The robot system incorporates state-of-the-art components for computer vision, linguistic processing, state management, high-level reasoning, and robot control. In a user evaluation, 31 participants interacted with the bartender in a range of social situations. Most customers successfully obtained a drink from the bartender in all scenarios, and the factors with the greatest impact on subjective satisfaction were task success and dialogue efficiency.
