Speech Emotion Recognition Based on Parametric Filter and Fractal Dimension

In this paper, we propose a new method that employs two novel features, correlation density (Cd) and fractal dimension (Fd), to recognize the emotional states conveyed in speech. Cd, obtained from a set of parametric filters, reflects both the broad frequency components contributed by unvoiced phones and the fine structure of the lower frequency components contributed by voiced phones. Fd indicates the nonlinearity and self-similarity of a speech signal. Comparative experiments based on Hidden Markov Model and K Nearest Neighbor methods are carried out. The results show that Cd and Fd are much more closely related to emotional expression than commonly used features.

Key words: speech emotion, parametric filter, correlation density, fractal dimension
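
As a rough illustration of what a fractal dimension feature measures, the sketch below estimates the fractal dimension of a sampled waveform with Katz's method. The abstract does not say which estimator is used for Fd, so the choice of Katz's method, the amplitude-only curve length, and the name `katz_fd` are assumptions made here for illustration only.

```python
import math
import random


def katz_fd(signal):
    """Katz fractal dimension of a 1-D sequence (amplitude-only variant).

    Illustrative assumption: not necessarily the estimator behind Fd.
    """
    if len(signal) < 3:
        raise ValueError("need at least three samples")
    # Absolute amplitude change between successive samples.
    steps = [abs(b - a) for a, b in zip(signal, signal[1:])]
    total_length = sum(steps)              # L: length of the curve
    n_steps = len(steps)                   # n: number of steps, L / mean step
    # d: largest excursion from the first sample.
    extent = max(abs(x - signal[0]) for x in signal)
    if extent <= total_length / n_steps:
        raise ValueError("degenerate signal: estimate undefined")
    log_n = math.log10(n_steps)
    return log_n / (log_n + math.log10(extent / total_length))


if __name__ == "__main__":
    rng = random.Random(0)
    sine = [math.sin(2 * math.pi * 5 * i / 1000) for i in range(1000)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(1000)]
    print("sine :", katz_fd(sine))
    print("noise:", katz_fd(noise))
```

A monotonic ramp gives a value of 1, and more irregular waveforms give larger values. In a recognition pipeline such an estimate would typically be computed per short frame and passed to the classifier alongside other features; the framing details are not given in the abstract.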
