Obtaining speech assets for judgement analysis on low-pass filtered emotional speech

Investigating the emotional content of speech through its acoustic characteristics requires separating the semantic content from the acoustic channel. For natural emotional speech, a widely used method of separating the two channels is cue masking. Our objective is to investigate the use of cue masking in non-acted emotional speech by analyzing the extent to which filtering impacts the perceived emotional content of the modified speech material. However, obtaining a corpus of emotional speech can be quite difficult, and verifying its emotional content is a thoroughly discussed issue. Current speech research shows a tendency toward constructing corpora of natural emotion expression. In this paper we outline the procedure used to obtain a corpus of high-audio-quality, natural emotional speech. We review the use of Mood Induction Procedures, which provide a method of eliciting spontaneous emotional speech in a controlled environment. Following this, we propose an experiment to investigate the effects of cue masking on natural emotional speech.
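As a rough illustration of the cue-masking idea, the sketch below applies a Butterworth low-pass filter to a signal, removing the high-frequency spectral detail that carries segmental (verbal) information while retaining the low-frequency region associated with prosody. The 400 Hz cutoff and filter order are illustrative assumptions, not parameters taken from the proposed experiment:

```python
import numpy as np
from scipy.signal import butter, filtfilt

def low_pass_mask(signal, sr, cutoff_hz=400.0, order=5):
    """Low-pass filter a speech signal to mask verbal cues while
    preserving low-frequency prosodic information (illustrative
    cutoff; actual cue-masking studies vary this parameter)."""
    nyquist = sr / 2.0
    b, a = butter(order, cutoff_hz / nyquist, btype="low")
    # filtfilt applies the filter forward and backward (zero phase)
    return filtfilt(b, a, signal)

# Demo on a synthetic 1-second signal: a 200 Hz component standing in
# for prosodic content plus a 2 kHz component standing in for
# segmental content.
sr = 16000
t = np.arange(sr) / sr
speech = np.sin(2 * np.pi * 200 * t) + 0.5 * np.sin(2 * np.pi * 2000 * t)
masked = low_pass_mask(speech, sr)
```

After filtering, the 2 kHz component is strongly attenuated while the 200 Hz component passes largely unchanged, which is the intended effect of cue masking: intelligibility is degraded but the prosodic contour survives for emotion-judgement tasks.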
