Development and Analysis of Speech Emotion Corpus Using Prosodic Features for Cross Linguistics

In everyday communication, spoken language carries not only linguistic information but also nonlinguistic information such as the speaker's emotions, gender, social status, and age. This paper introduces the Emotion-Pak corpus, a multilingual emotional speech database of emotional sentences elicited in provincial languages of Pakistan (Urdu, Sindhi, Balochi, Punjabi, and Pashto) for observing the emotions present in acoustic speech signals. The database was recorded from naive (non-actor) speakers from different regions of Pakistan who differ in language, age, gender, education level, and cultural background. It is compared cross-linguistically with the Berlin database of emotional speech (EMO-DB), which was recorded by actors, in order to study, using prosodic features, whether emotions are gender and language dependent or independent. Statistical analysis of the Emotion-Pak corpus shows that emotions with short duration and strong intensity (Anger and Happiness) have similar acoustic features, as do emotions with longer duration and weak intensity (Sadness and Comfort). Subjective listening tests were used to evaluate the quality of the speech emotions in the proposed corpus, and their results were found to be quite similar to those obtained from the prosodic analysis of the Emotion-Pak corpus.
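As a rough illustration of the two prosodic measures named above, duration and intensity, the following sketch computes both for a waveform held in a NumPy array. It is not the authors' analysis pipeline: the function names, the 16 kHz sample rate, the reference level, and the synthetic sine-wave "utterances" are all assumptions made for illustration, and real recordings from a corpus would replace the synthetic signals.

```python
import numpy as np


def duration_seconds(signal, sample_rate):
    """Utterance duration in seconds (number of samples / sample rate)."""
    return len(signal) / float(sample_rate)


def mean_intensity_db(signal, ref=1.0):
    """Mean intensity as RMS level in dB relative to `ref` (an assumed reference)."""
    samples = np.asarray(signal, dtype=np.float64)
    rms = np.sqrt(np.mean(samples ** 2))
    # Guard against log(0) for an all-zero (silent) signal.
    return 20.0 * np.log10(max(rms, 1e-12) / ref)


# Synthetic stand-ins (assumptions, not corpus data):
# a short, loud signal and a longer, quiet one.
sample_rate = 16000
t_short = np.arange(int(0.5 * sample_rate)) / sample_rate
t_long = np.arange(int(1.5 * sample_rate)) / sample_rate
loud_short = 0.8 * np.sin(2 * np.pi * 220 * t_short)
quiet_long = 0.1 * np.sin(2 * np.pi * 220 * t_long)

for name, sig in [("loud_short", loud_short), ("quiet_long", quiet_long)]:
    print(name, duration_seconds(sig, sample_rate), round(mean_intensity_db(sig), 2))
```

Under these assumptions, the short, loud signal has the smaller duration and the higher RMS level, which is the kind of contrast the abstract describes between the Anger/Happiness and Sadness/Comfort groups.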
