Embedding Conversational Agents into AR: Invisible or with a Realistic Human Body?

Currently, invisible smart speech assistants such as Siri, Alexa, and Cortana are used by a constantly growing number of people. Moreover, Augmented Reality (AR) glasses are predicted to become widespread consumer devices. Smart assistants could therefore easily become common applications on AR glasses, which makes it possible to give the assistant a visual representation as an embodied agent. While previous research on embodied agents found that users prefer a humanoid appearance, research on the uncanny valley suggests that simply designed humanoids can be favored over hyper-realistic humanoid characters. In a user study, we compared agents with a simple versus a more realistic appearance (seen through AR glasses) against an invisible state-of-the-art speech assistant (see Figure 1). Our results indicate that a more realistic visualization is preferred because it provides additional communication cues, such as eye contact and gaze, which appear to be key features when talking to a smart assistant. However, if the situation requires visual attention, e.g., when the user is mobile or multitasking, an invisible agent can be more appropriate, as it does not distract the visual focus, which can be essential during AR experiences.
