Real-time speech emotion recognition by minimum number of features

Speech emotion recognition is one aspect of equipping robots with human capabilities. The need to trade off between computational load and recognition accuracy is the main challenge of real-time processing. The application domain of this paper is robotics; therefore both factors are important. Selecting discriminative features with low dimensionality and high resolution is the optimal solution for both decreasing computational load and increasing recognition accuracy. In this paper, a feature vector with a minimum number of elements is proposed for recognizing the emotional states of speech. The elements of the proposed feature vector are selected from both prosodic and frequency-domain features. Evaluation results show that the proposed vector achieves the desired performance for real-time processing.
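
The abstract does not list the exact elements of the feature vector, only that they combine prosodic and frequency-domain information. As a minimal sketch of that idea, the snippet below (assuming pitch and energy statistics as the prosodic part and a few MFCC means as the frequency part, extracted with the librosa library) builds a compact per-utterance vector; the specific features and their number are illustrative assumptions, not the paper's actual selection.

```python
import numpy as np
import librosa

def extract_minimal_features(path, sr=16000, n_mfcc=4):
    """Build a small prosodic + spectral feature vector for one utterance (illustrative only)."""
    y, sr = librosa.load(path, sr=sr)

    # Prosodic side: fundamental frequency (pitch) contour and short-time energy.
    f0, voiced_flag, _ = librosa.pyin(
        y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C7"), sr=sr
    )
    f0 = f0[~np.isnan(f0)]                 # keep voiced frames only
    rms = librosa.feature.rms(y=y)[0]      # frame-wise RMS energy

    # Frequency-domain side: a few MFCCs averaged over time.
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)

    # Summarize each contour with simple statistics to keep the vector small.
    vec = np.hstack([
        f0.mean() if f0.size else 0.0,
        f0.std() if f0.size else 0.0,
        rms.mean(),
        rms.std(),
        mfcc.mean(axis=1),
    ])
    return vec  # 4 prosodic statistics + n_mfcc spectral means
```

Summarizing frame-level contours with a handful of statistics, rather than keeping full frame sequences, is one common way to hold the feature count down for real-time classification.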
