The Face Speaks: Contextual and Temporal Sensitivity to Backchannel Responses

It is often assumed that one person in a conversation is active (the speaker) while the rest are passive (the listeners). Conversation analysis has shown, however, that listeners take an active part in the conversation, providing feedback signals that can control conversational flow. The face plays a vital role in these backchannel responses. A deeper understanding of facial backchannel signals is crucial for many applications in social signal processing, including the automatic modeling and analysis of conversations and the development of life-like, effective conversational agents. Here, we present results from two experiments that tested sensitivity to the context and the timing of backchannel responses. We used sequences from a newly recorded database of five-minute, two-person conversations. Experiment 1 tested how well participants could match backchannel sequences to their corresponding speaker sequences. On average, participants performed well above chance. Experiment 2 tested how sensitive participants were to temporal misalignments of the backchannel sequence. Interestingly, participants were able to estimate the correct temporal alignment of the sequence pairs. Taken together, our results show that human conversational skills are highly tuned to both context and temporal alignment, underscoring the need for accurate modeling of conversations in social signal processing.

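The abstract does not describe the statistical analyses, but a minimal Python sketch of how such results are typically evaluated may help: a binomial test of matching accuracy against chance (Experiment 1) and a bell-shaped tuning-curve fit to alignment judgments as a function of temporal offset (Experiment 2). All specific numbers below (trial counts, the 0.5 chance level, offsets, and judgment proportions) are hypothetical placeholders, not values from the study.

    import numpy as np
    from scipy.stats import binomtest
    from scipy.optimize import curve_fit

    # --- Experiment 1: is matching accuracy above chance? ---
    # Hypothetical numbers: 60 trials, chance level 1/2 (e.g., a
    # two-alternative matching task), 42 correct responses.
    n_trials, chance, n_correct = 60, 0.5, 42
    result = binomtest(n_correct, n_trials, chance, alternative="greater")
    print(f"Accuracy {n_correct / n_trials:.2f}, p = {result.pvalue:.4f}")

    # --- Experiment 2: estimating the point of best temporal alignment ---
    # Hypothetical data: proportion of "well aligned" judgments for
    # backchannel sequences shifted by various offsets (in seconds).
    offsets = np.array([-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0])
    p_aligned = np.array([0.15, 0.40, 0.70, 0.85, 0.65, 0.35, 0.10])

    def gaussian(x, peak, mu, sigma, base):
        """Bell-shaped tuning curve over temporal offset."""
        return base + peak * np.exp(-0.5 * ((x - mu) / sigma) ** 2)

    # The fitted mean mu estimates the offset perceived as best aligned.
    params, _ = curve_fit(gaussian, offsets, p_aligned,
                          p0=[0.8, 0.0, 1.0, 0.1])
    print(f"Estimated best alignment at offset {params[1]:+.2f} s")

Under this sketch, a peak near an offset of zero would correspond to the reported finding that participants could recover the correct temporal alignment of speaker and backchannel sequences.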