Reducing Speech Collisions by Using an Artificial Subtle Expression in a Decelerated Spoken Dialogue

We argue that spoken dialogue systems and communication robots need not reply quickly in speech, provided they respond quickly non-verbally by conveying their internal states through an artificial subtle expression. This paper describes an experiment whose results support this claim. In the experiment, 48 participants carried out reservation tasks with a spoken dialogue system coupled with an interface robot that used a blinking-light expression. The blinking light is designed as an artificial subtle expression that intuitively notifies users of the robot's internal state (e.g., that it is still processing), in order to reduce speech collisions, which arise from turn-taking failures caused by end-of-turn misdetection, harm smooth spoken communication, and degrade system usability. Two experimental factors were set up: the blinking-light factor (with or without the blinking light) and the reply-speed factor (moderate or slow replies), yielding four experimental conditions. The results suggest that the blinking-light expression reduces speech collisions and improves users' impressions of the system, and, surprisingly, that users are not bothered by slow replies.
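To make the intended interaction pattern concrete, the following is a minimal sketch, not the authors' implementation: all names are hypothetical, the light is simulated with console output, and the processing delay stands in for recognition and dialogue management. It shows the idea that a tentative end-of-turn detection triggers an immediate non-verbal "processing" cue while the verbal reply is deliberately delayed, so a misdetection produces a harmless blink rather than a speech collision.

```python
# Hypothetical sketch of a blinking-light "processing" cue used as an
# artificial subtle expression while the system delays its spoken reply.
import threading
import time


class BlinkingLight:
    """Simulated LED used as an artificial subtle expression."""

    def __init__(self, interval: float = 0.5):
        self.interval = interval
        self._stop = threading.Event()
        self._thread: threading.Thread | None = None

    def start(self) -> None:
        self._stop.clear()
        self._thread = threading.Thread(target=self._blink, daemon=True)
        self._thread.start()

    def _blink(self) -> None:
        on = False
        while not self._stop.is_set():
            on = not on
            print("[light]", "on" if on else "off")
            time.sleep(self.interval)

    def stop(self) -> None:
        self._stop.set()
        if self._thread is not None:
            self._thread.join()
        print("[light] off")


def handle_possible_end_of_turn(light: BlinkingLight, reply: str,
                                processing_time: float = 2.0) -> None:
    """On a tentative end-of-turn, blink immediately, then reply slowly."""
    light.start()                 # instant non-verbal acknowledgement
    time.sleep(processing_time)   # stand-in for ASR/NLU/response generation
    light.stop()
    print("[system]", reply)      # delayed verbal reply, after the cue


if __name__ == "__main__":
    light = BlinkingLight()
    print("[user] I'd like to make a reservation for Friday.")
    handle_possible_end_of_turn(light, "Friday is available. Shall I book it?")
```

The design choice illustrated here is that the non-verbal channel (the light) absorbs the timing pressure: the system can afford a slow, carefully endpointed spoken reply because the user already sees that it is working.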
