Perception of emotional valences and activity levels from vowel segments of continuous speech.

This study investigated the role of voice source and formant frequencies in the perception of emotional valence and psychophysiological activity level from short vowel samples (approximately 150 milliseconds). Nine professional actors (five males and four females) read a prose passage simulating joy, tenderness, sadness, anger, and a neutral emotional state. The stress-carrying vowel [a:] was extracted from continuous speech, from the Finnish word [ta:k:ahan], and analyzed for duration, fundamental frequency (F0), equivalent sound level (L_eq), alpha ratio, and formant frequencies F1-F4. Alpha ratio was calculated by subtracting the L_eq (dB) in the range 50 Hz-1 kHz from the L_eq in the range 1-5 kHz. The samples were inverse filtered by Iterative Adaptive Inverse Filtering, and the resulting estimates of the glottal flow were parameterized with the normalized amplitude quotient (NAQ = f_AC / (d_peak × T)). Fifty listeners (mean age 28.5 years) identified the emotional valences from the randomized samples. Multinomial logistic regression analysis was used to study the interrelations of the parameters for perception. It appeared possible to identify valences from vowel samples of short duration (approximately 150 milliseconds). NAQ tended to differentiate between the valences and activity levels perceived in both genders. Voice source may not only reflect variations of F0 and L_eq, but may also have an independent role in expression, reflecting phonation types. To some extent, formant frequencies appeared to be related to valence perception, but no clear patterns could be identified. Coding of valence tends to be a complicated multiparameter phenomenon with wide individual variation.
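The two derived measures named above, alpha ratio and NAQ, can be illustrated with a minimal sketch. It is not the authors' analysis pipeline. It assumes that band levels are estimated from a plain DFT power spectrum, that the 1 kHz bin is counted in the upper band, and that NAQ is computed on a single cycle of an already inverse-filtered glottal flow, with f_AC taken as the peak-to-peak flow amplitude, d_peak as the magnitude of the negative peak of the flow derivative, and T as the cycle duration. The function names and the test signals are hypothetical.

```python
import cmath
import math


def band_power(signal, fs, f_lo, f_hi):
    """Sum DFT power over bins with f_lo <= f < f_hi (Hz).

    Uses a naive DFT, which is adequate for short illustrative signals.
    """
    n = len(signal)
    total = 0.0
    for k in range(n // 2 + 1):
        freq = k * fs / n
        if f_lo <= freq < f_hi:
            coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                        for i, x in enumerate(signal))
            total += abs(coeff) ** 2
    return total


def alpha_ratio_db(signal, fs):
    """Level in 1-5 kHz minus level in 50 Hz-1 kHz, in dB.

    Assumption: the 1 kHz bin belongs to the upper band.
    """
    low = band_power(signal, fs, 50.0, 1000.0)
    high = band_power(signal, fs, 1000.0, 5000.0)
    return 10.0 * math.log10(high / low)


def naq(flow_cycle, fs):
    """NAQ = f_AC / (d_peak * T) for one cycle of glottal flow samples."""
    f_ac = max(flow_cycle) - min(flow_cycle)            # peak-to-peak flow
    derivative = [(b - a) * fs
                  for a, b in zip(flow_cycle, flow_cycle[1:])]
    d_peak = -min(derivative)                           # negative peak magnitude
    period = len(flow_cycle) / fs                       # cycle duration T in s
    return f_ac / (d_peak * period)


# Hypothetical test signal: 500 Hz tone plus a 2 kHz tone at one tenth the amplitude.
FS = 16000
tone = [math.sin(2 * math.pi * 500 * i / FS)
        + 0.1 * math.sin(2 * math.pi * 2000 * i / FS)
        for i in range(320)]

# Hypothetical flow cycle: slow linear opening, fast closing, short closed phase.
cycle = [0.0, 1 / 6, 2 / 6, 3 / 6, 4 / 6, 5 / 6, 1.0, 0.5, 0.0, 0.0]

if __name__ == "__main__":
    print("alpha ratio (dB):", alpha_ratio_db(tone, FS))
    print("NAQ:", naq(cycle, 1000.0))
```

Because NAQ divides the flow amplitude by both the derivative peak and the period, it is dimensionless and does not depend on the sampling rate used to express the cycle. A more abrupt glottal closure, as in pressed phonation, raises d_peak and lowers NAQ.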
