Role of embodiment and presence in human perception of robots' facial cues

Abstract

Both robotic and virtual agents could one day be equipped with the social abilities necessary for effective and natural interaction with human beings. Although virtual agents are relatively inexpensive and flexible, they lack the physical embodiment of robotic agents. Surprisingly, the role of embodiment and physical presence in enriching human-robot interaction is still unclear. This paper explores how these unique features of robotic agents influence three major elements of human-robot face-to-face communication: the perception of visual speech, facial expression, and eye gaze. We used a quantitative approach to disentangle the role of embodiment from that of physical presence using a social robot, called Ryan, presented as three different agents (a physically present robot, a telepresent robot, and a virtual agent), as well as an actual human. The robot has a retro-projected face, so the same animation driving the virtual agent can be projected onto the robotic face, allowing the virtual agent's animation behaviors to be compared directly with those of both the telepresent and the physically present robot. The results of our studies indicate that eye gaze and certain facial expressions are perceived more accurately when the embodied agent is physically present than when it is displayed on a 2D screen as either a telepresent or a virtual agent. Conversely, we find no evidence that either the embodiment or the presence of the robot improves the perception of visual speech, regardless of syntactic or semantic cues. Comparison of our findings with previous studies also indicates that the role of embodiment and presence should not be generalized without considering the limitations of the embodied agents.
