Intonation in Robot Speech: Does it work the same as with people?

Human-robot interaction (HRI) research aims to design natural interactions between humans and robots. Intonation, a social signaling function of human speech that has been investigated thoroughly in linguistics, has not yet been studied in HRI. This study investigates the effect of robot speech intonation across four conditions (no intonation, focus intonation, end-of-utterance intonation, or combined intonation) on conversational naturalness, social engagement, and people's humanlike perception of the robot, collecting objective and subjective data from participant conversations (n = 120). Our results showed that humanlike intonation partially improved subjective naturalness but not observed fluency, and that intonation partially improved social engagement but did not affect humanlike perceptions of the robot. Given that our results largely differed from our hypotheses, which were based on human speech intonation, we discuss the implications and provide suggestions for future research to further investigate conversational naturalness in robot speech intonation.

ACM Reference Format: Ella Velner, Paul P.G. Boersma, and Maartje M.A. de Graaf. 2020. Intonation in Robot Speech: Does it work the same as with people? In Proceedings of the 2020 ACM/IEEE International Conference on Human-Robot Interaction (HRI '20), March 23-26, 2020, Cambridge, United Kingdom. ACM, New York, NY, USA, 10 pages. https://doi.org/10.1145/3319502.3374801
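The abstract does not describe how the intonation conditions were produced, but the kind of pitch manipulation such conditions typically involve can be sketched with the parselmouth Python interface to Praat. This is a minimal, illustrative sketch rather than the authors' actual pipeline: the file name, pitch range, and target F0 value are assumptions, and the "no intonation" condition is approximated here by flattening an utterance's pitch contour to a monotone.

```python
# Illustrative sketch (not the study's pipeline): flatten an utterance's pitch
# contour with Praat via parselmouth, approximating a monotone "no intonation"
# condition for robot speech stimuli.
import parselmouth
from parselmouth.praat import call

snd = parselmouth.Sound("utterance.wav")  # hypothetical synthesized utterance

# Build a Praat Manipulation object (10 ms time step, 75-600 Hz pitch range).
manipulation = call(snd, "To Manipulation", 0.01, 75, 600)
pitch_tier = call(manipulation, "Extract pitch tier")

# Replace the natural contour with a single flat point (arbitrary 120 Hz target).
call(pitch_tier, "Remove points between", snd.xmin, snd.xmax)
call(pitch_tier, "Add point", snd.duration / 2, 120.0)
call([pitch_tier, manipulation], "Replace pitch tier")

# Resynthesize the monotone version and save it as a stimulus file.
flat = call(manipulation, "Get resynthesis (overlap-add)")
flat.save("utterance_monotone.wav", "WAV")
```

Under the same assumptions, the focus and end-of-utterance conditions would instead add targeted pitch movements (e.g., a rise or fall) to the pitch tier at the focused word or at the utterance boundary before resynthesis.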
