Does Synthetic Voice alter Social Response to a Photorealistic Character in Virtual Reality?

In this paper, we investigate the effect of a realism mismatch between the voice and appearance of a photorealistic virtual character in virtual reality (VR). While many studies have investigated voice attributes for robots, little is known about how voice naturalness affects the perception of realistic virtual characters. We conducted a VR experiment with over two hundred participants, investigating how a mismatch between realistic appearance and unrealistic voice affects the feeling of presence and the user's emotional response to a character expressing a strong negative emotion (sadness, guilt). We predicted that the mismatched voice would lower social presence and cause users to have a negative emotional reaction and feelings of discomfort towards the character. We found that concern for the virtual character was indeed altered by the unnatural voice, though interestingly the voice did not affect social presence.
