Conversational Strategies for Robustly Managing Dialog in Public Spaces

Open environments present an attention-management challenge for conversational systems. We describe a kiosk system (based on RavenClaw/Olympus) that uses simple auditory and visual information to interpret human presence and manage the system's attention. The system robustly differentiates intended interactions from unintended ones with 93% accuracy and achieves similar task completion rates in a quiet room and in a public space.
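The abstract does not spell out the decision logic, but as a rough illustration of how simple auditory and visual cues could be fused to decide whether an utterance is directed at the kiosk, the sketch below combines voice activity, face detection, and the agreement between the sound's direction and the face's position. The cue names, thresholds, and rule structure are hypothetical assumptions for illustration, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class SensorFrame:
    """Hypothetical per-frame cues; the paper's actual feature set is not reproduced here."""
    speech_detected: bool    # voice activity detected on the microphone
    sound_angle_deg: float   # estimated direction of arrival of the sound
    face_detected: bool      # a face is visible to the camera
    face_angle_deg: float    # angular position of the detected face
    face_distance_m: float   # estimated distance of the person


def is_intended_interaction(frame: SensorFrame,
                            max_angle_gap_deg: float = 20.0,
                            max_distance_m: float = 1.5) -> bool:
    """Treat an utterance as intended only when speech co-occurs with a nearby,
    roughly co-located face; otherwise ignore it as background activity."""
    if not (frame.speech_detected and frame.face_detected):
        return False
    if frame.face_distance_m > max_distance_m:
        return False
    return abs(frame.sound_angle_deg - frame.face_angle_deg) <= max_angle_gap_deg


# A nearby user speaking toward the kiosk is accepted; background speech
# with no visible face is rejected.
print(is_intended_interaction(SensorFrame(True, 5.0, True, 2.0, 1.0)))    # True
print(is_intended_interaction(SensorFrame(True, 60.0, False, 0.0, 5.0)))  # False
```

In practice such a rule-based gate could be replaced by a trained classifier over the same cues; the sketch only conveys the general idea of gating the dialog manager's attention on fused audio-visual evidence.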
