Robust emotion recognition feature, frequency range of meaningful signal

The literature on emotion recognition from voice generally classifies emotions in terms of primary (or basic) emotions, but it fails to explain the rationale for this classification. In addition, more exact recognition requires more features for classifying emotion, yet only a few are in use, such as energy, pitch, and tempo. Hence, rather than using primary emotions, we classify emotions into emotional groups that share the same emotional state. We also propose a new feature for emotion recognition from voice, called the frequency range of meaningful signal. In contrast to other features, this feature is independent of the magnitude of the speech signal and is robust in noisy environments. Recognition experiments confirm the usefulness of the proposed feature.
