MASC: A Speech Corpus in Mandarin for Emotion Analysis and Affective Speaker Recognition

This paper introduces MASC (Mandarin Affective Speech Corpus), a large emotional speech database. The database contains recordings of 68 native speakers (23 female and 45 male) in five emotional states: neutral, anger, elation, panic, and sadness. Each speaker reads 5 phrases and 10 sentences three times in each emotional state, plus 2 paragraphs in the neutral state only. Together, these materials cover all the phonemes of Chinese. The corpus is designed for prosodic and linguistic investigation of emotion expression in Mandarin, and it can also be used for recognition of affectively stressed speakers. In addition, prosodic feature analysis and a baseline speaker recognition experiment are performed on the database.
