Characterizing Emotion in the Soundtrack of an Animated Film: Credible or Incredible?

In this study we present a novel emotional speech corpus consisting of dialog extracted from an animated film. This type of corpus offers an interesting compromise between the sparsity of emotion found in spontaneous speech and the contrived emotion found in speech acted solely for research purposes. The dialog was segmented into 453 short units and judged for emotional content by native and non-native English speakers. Emotion was rated on two scales: Activation and Valence. Acoustic analysis yielded a comprehensive set of 100 features covering F0, intensity, voice quality and spectrum. We found that Activation is more strongly correlated with our acoustic features than Valence: Activation was correlated with several types of features, whereas Valence was correlated mainly with intensity-related features. Further, ANOVA showed some interesting contrasts between the two scales, as well as interesting differences between the judgments of native and non-native English speakers.
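The correlation analysis described above can be sketched as follows. This is an illustrative outline only, not the authors' code: the feature names and the data are hypothetical placeholders standing in for the 100 acoustic features and the per-unit listener ratings.

```python
# Illustrative sketch: correlating per-unit acoustic features with
# mean listener ratings on the Activation scale.
# All data below are synthetic placeholders, not corpus values.
import random
from statistics import mean

def pearson(xs, ys):
    # Plain Pearson correlation coefficient between two equal-length lists.
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = (sum((x - mx) ** 2 for x in xs)
           * sum((y - my) ** 2 for y in ys)) ** 0.5
    return num / den if den else 0.0

random.seed(0)
n_units = 453  # number of dialog units in the corpus

# Hypothetical features: mean F0 (Hz) and mean intensity (dB) per unit.
f0 = [random.gauss(200, 30) for _ in range(n_units)]
intensity = [random.gauss(60, 5) for _ in range(n_units)]

# Hypothetical Activation ratings, loosely tied to intensity for illustration.
activation = [0.1 * i + random.gauss(0, 1) for i in intensity]

# Rank each feature by the strength of its correlation with Activation.
for name, feat in [("mean_F0", f0), ("mean_intensity", intensity)]:
    print(name, round(pearson(feat, activation), 3))
```

In the study itself each of the 100 features would be correlated against the averaged judgments on both scales, with intensity-related features expected to dominate for Valence and a broader mix for Activation.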
