Learning Backchannel Prediction Model from Parasocial Consensus Sampling: A Subjective Evaluation

Backchannel feedback is an important kind of nonverbal feedback within face-to-face interaction that signals a person's interest, attention and willingness to keep listening. Learning to predict when to give such feedback is one of the keys to creating natural and realistic virtual humans. Prediction models are traditionally learned from large corpora of annotated face-to-face interactions, but this approach has several limitations. Previously, we proposed a novel data collection method, Parasocial Consensus Sampling, which addresses these limitations. In this paper, we show that data collected in this manner can produce effective learned models. A subjective evaluation shows that the virtual human driven by the resulting probabilistic model significantly outperforms a previously published rule-based agent in terms of rapport, perceived accuracy and naturalness, and in some cases it even outperforms the virtual human driven by real listeners' behavior.
