Is there a dominant channel in perception of emotions?

The objective of this study was to determine whether a perceptually dominant channel for carrying emotional cues could be identified among speech, textual content, and facial expression. To this end, a Wizard-of-Oz scenario was used to elicit a corpus of emotional speech and facial expressions from five female speakers. Excerpts from this corpus were then presented to 48 listeners in four modalities: audio only, video only, text only, and video+audio. Listeners judged emotional content on two scales: Activation and Valence. Most listeners rated the combined modality easiest to judge and video alone the most difficult. Statistical analysis of the judgments revealed that Activation was more difficult to judge than Valence. Furthermore, the best agreement on Valence was obtained among judgments based on audio alone, text alone, and the combined channel, indicating that textual content had a major, and indeed dominant, influence on the judgments.
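The abstract does not specify which agreement statistic was used. As a minimal sketch only, assuming agreement between modalities is measured as the pairwise Pearson correlation of per-excerpt mean Valence ratings (the data below are fabricated for illustration, not from the study):

```python
import numpy as np

# Hypothetical illustration only: the study's actual agreement statistic is
# not given in the abstract. This sketch assumes agreement between modalities
# is measured as the Pearson correlation of per-excerpt mean Valence ratings.

rng = np.random.default_rng(0)
n_excerpts = 20

# Fabricated per-excerpt mean Valence ratings for each presentation modality.
ratings = {
    "audio": rng.uniform(-1, 1, n_excerpts),
    "video": rng.uniform(-1, 1, n_excerpts),
    "text": rng.uniform(-1, 1, n_excerpts),
    "video+audio": rng.uniform(-1, 1, n_excerpts),
}

# Pairwise Pearson correlation between all modality pairs.
modalities = list(ratings)
for i, a in enumerate(modalities):
    for b in modalities[i + 1:]:
        r = np.corrcoef(ratings[a], ratings[b])[0, 1]
        print(f"{a:>11s} vs {b:<11s} r = {r:+.2f}")
```

Under this reading, the paper's finding would correspond to the audio/text/video+audio pairs showing higher correlations with one another than any pairing involving video alone.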