Cross-Lingual Vocal Emotion Recognition in Five Native Languages of Assam Using Eigenvalue Decomposition

This work investigates whether vocal expressions of full-blown discrete emotions can be recognized cross-lingually. Such a study provides further insight into the nature and function of emotion, and supports the development of a generalized vocal emotion recognition system, which would improve the efficiency of human-machine interaction systems. An emotional speech database was created with 140 simulated utterances per speaker (20 per emotion), consisting of short sentences covering six full-blown discrete basic emotions and one 'no-emotion' (i.e., neutral) category in five native languages (not dialects) of Assam. A new feature set is proposed, based on the Eigenvalues of the Autocorrelation Matrix (EVAM) of each frame of an utterance, with a Gaussian Mixture Model (GMM) as the classifier. The performance of the EVAM feature set is compared at two sampling frequencies (44.1 kHz and 8.1 kHz) and under additive white noise at signal-to-noise ratios of 0, 5, 10, and 20 dB.
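The two signal-level steps named above, additive white noise at a target SNR and per-frame eigenvalues of the autocorrelation matrix, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the frame length, hop size, and autocorrelation-matrix order are assumed values (the abstract does not specify them), and the GMM classification stage (e.g., one GMM fit per emotion class on these features) is omitted.

```python
import numpy as np

def add_white_noise(x, snr_db, rng):
    """Mix in white Gaussian noise scaled to a target SNR (in dB)."""
    noise = rng.standard_normal(len(x))
    p_signal = np.mean(x ** 2)
    p_noise = np.mean(noise ** 2)
    scale = np.sqrt(p_signal / (p_noise * 10 ** (snr_db / 10)))
    return x + scale * noise

def evam_features(x, frame_len=400, hop=160, order=10):
    """Per-frame EVAM features: eigenvalues of an order-by-order Toeplitz
    autocorrelation matrix built from biased autocorrelation estimates of
    each frame (frame_len, hop, and order are illustrative choices)."""
    feats = []
    for start in range(0, len(x) - frame_len + 1, hop):
        frame = x[start:start + frame_len]
        # biased autocorrelation estimates r[0..order-1]
        r = np.array([np.dot(frame[:frame_len - k], frame[k:]) / frame_len
                      for k in range(order)])
        # symmetric Toeplitz autocorrelation matrix
        R = np.array([[r[abs(i - j)] for j in range(order)]
                      for i in range(order)])
        # symmetric PSD matrix -> real, non-negative eigenvalues;
        # sort descending for a fixed feature ordering
        feats.append(np.sort(np.linalg.eigvalsh(R))[::-1])
    return np.array(feats)

rng = np.random.default_rng(0)
clean = rng.standard_normal(8000)            # 1 s of stand-in audio at 8 kHz
noisy = add_white_noise(clean, snr_db=10, rng=rng)
F = evam_features(noisy)
print(F.shape)                               # (num_frames, order)
```

In practice the eigenvalue vector of each frame would serve as the feature input to per-emotion GMMs, with the test utterance assigned to the emotion whose model gives the highest likelihood.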