Empathetic Speech Synthesis and Testing for Healthcare Robots

One of the major factors affecting the acceptance of robots in human-robot interaction is the voice with which they speak to humans. A robot's voice can be used to express empathy, an affective response of the robot to the human user. This study investigates whether social robots with an empathetic voice are acceptable to users in healthcare applications. A pilot study was conducted using an empathetic voice spoken by a voice actor; empathy was expressed through speech prosody alone, without any visual cues. The emotions required for an empathetic voice were also identified, and were found to include not only the stronger primary emotions but also the more nuanced secondary emotions. These emotions were then synthesised using prosody modelling. A second study, replicating the pilot test, was conducted with the synthesised voices to investigate whether empathy is also perceived from synthetic speech. This paper reports the modelling and synthesis of an empathetic voice and shows experimentally that people prefer an empathetic voice for healthcare robots. The results can be used to develop empathetic social robots that improve people's acceptance of social robots.
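To illustrate the general idea of expressing emotion through prosody alone, the sketch below maps emotion labels to prosodic targets (pitch, speaking rate, volume) and renders them as SSML prosody markup, which many text-to-speech engines accept. This is a minimal sketch under assumed parameters: the emotion labels and numeric values are illustrative only and are not the prosody models or settings used in the study.

```python
# Minimal sketch: conveying emotion through prosody alone, via SSML markup.
# The emotion-to-prosody table below is a hypothetical example; the actual
# prosody models in the study were derived from recordings of a voice actor.

from dataclasses import dataclass
from xml.sax.saxutils import escape


@dataclass
class ProsodyTarget:
    pitch: str   # relative pitch shift, e.g. "+15%" or "-10%"
    rate: str    # speaking rate, e.g. "110%" (faster) or "85%" (slower)
    volume: str  # loudness adjustment, e.g. "+2dB" or "-3dB"


# Assumed mapping of primary and secondary emotions to prosodic settings.
EMOTION_PROSODY = {
    "happy":       ProsodyTarget(pitch="+15%", rate="110%", volume="+2dB"),
    "sad":         ProsodyTarget(pitch="-10%", rate="85%",  volume="-3dB"),
    "concerned":   ProsodyTarget(pitch="-5%",  rate="90%",  volume="-1dB"),   # secondary emotion
    "encouraging": ProsodyTarget(pitch="+8%",  rate="100%", volume="+1dB"),   # secondary emotion
}


def to_ssml(text: str, emotion: str) -> str:
    """Wrap `text` in an SSML <prosody> element for the given emotion."""
    target = EMOTION_PROSODY[emotion]
    return (
        "<speak>"
        f'<prosody pitch="{target.pitch}" rate="{target.rate}" volume="{target.volume}">'
        f"{escape(text)}"
        "</prosody>"
        "</speak>"
    )


if __name__ == "__main__":
    # Example utterance a healthcare robot might deliver with a concerned tone.
    print(to_ssml("I am sorry to hear you are in pain today.", "concerned"))
```

The resulting SSML string could be passed to any synthesiser that supports the prosody element; richer prosody modelling, as used in the studies reported here, would additionally shape pitch contours and timing within the utterance rather than applying a single global setting.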
