EmpathicSDS: Investigating Lexical and Acoustic Mimicry to Improve Perceived Empathy in Speech Dialogue Systems

In human-to-human conversation, showing empathy, and thus understanding of the other party's situation, is crucial for natural interaction. Emotional mimicry, i.e., imitating the expressions of the person with whom we are interacting, is one of the basic mechanisms contributing to empathy. State-of-the-art speech dialogue systems still lack the ability to show empathy, which limits their naturalness. We therefore present EmpathicSDS, a prototype for investigating the potential of lexical and acoustic mimicry to improve empathy in conversational interfaces. Our prototype comprises three modes: (1) neutral, where the system's response to a user query is static; (2) lexical mimicry, where the system picks up and mirrors the user's wording in its response; and (3) lexical and acoustic mimicry, which combines lexical mimicry with matching the system's voice emotion to the user's emotional state. We conducted a user study with 33 participants to evaluate the effect of the mimicry approaches on user perception and to explore the role of user emotions. Our results show that lexical mimicry significantly improves perceived empathy and personalization without affecting efficiency. Acoustic mimicry can further improve naturalness in the positive-emotion condition while impairing efficiency in the negative-emotion condition.
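
For illustration only, a minimal sketch of how the three response modes could be wired together is given below. The paper does not disclose its implementation, so every name here (`Mode`, `UserTurn`, `mirror_wording`, `respond`, the emotion labels) is a hypothetical placeholder, not the authors' API; assume the user's emotional state comes from some upstream emotion classifier and that `voice_emotion` parameterizes an emotional TTS engine.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Mode(Enum):
    NEUTRAL = auto()            # static canned response
    LEXICAL = auto()            # mirror the user's wording
    LEXICAL_ACOUSTIC = auto()   # mirror wording + match voice emotion


@dataclass
class UserTurn:
    text: str
    emotion: str  # e.g. "positive" / "negative", from an upstream classifier


def mirror_wording(user_text: str, template: str) -> str:
    """Hypothetical lexical mimicry: weave a salient user phrase into
    the system's answer template instead of returning it verbatim."""
    # Naive illustration: echo the user's own phrasing back.
    key_phrase = user_text.strip().rstrip("?.!")
    return template.format(echo=key_phrase)


def respond(turn: UserTurn, mode: Mode) -> dict:
    """Produce response text plus a voice-emotion tag for an emotional TTS."""
    template_static = "Here is the information you requested."
    template_mimic = "Regarding '{echo}', here is what I found."

    if mode is Mode.NEUTRAL:
        text = template_static
        voice_emotion = "neutral"
    elif mode is Mode.LEXICAL:
        text = mirror_wording(turn.text, template_mimic)
        voice_emotion = "neutral"
    else:  # Mode.LEXICAL_ACOUSTIC
        text = mirror_wording(turn.text, template_mimic)
        voice_emotion = turn.emotion  # acoustic mimicry: match user emotion

    return {"text": text, "voice_emotion": voice_emotion}


if __name__ == "__main__":
    turn = UserTurn(text="Where is the nearest charging station?",
                    emotion="negative")
    print(respond(turn, Mode.LEXICAL_ACOUSTIC))
```

The key design point the sketch tries to capture is that lexical and acoustic mimicry are independent axes: the second mode changes only the wording, while the third additionally steers the synthesis voice toward the detected user emotion.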
