Speech emotion recognition approaches in human computer interaction

Speech emotion recognition (SER) is an emerging field in human-computer interaction. The quality of a human-computer interface that mimics human speech emotions depends heavily on the types of features used and on the classifier employed for recognition. This paper presents a wide range of features employed for speech emotion recognition, together with the acoustic characteristics of those features. We also analyze the performance of the features in terms of precision, recall, F-measure, and recognition rate on two commonly used emotional speech databases, the Berlin emotional database and the Danish emotional database. Emotional speech recognition is being applied in modern human-computer interfaces, and an overview of 10 applications is also presented to illustrate the importance of this technique.
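
The four evaluation measures named above can be illustrated with a minimal sketch. It assumes the usual per-class definitions of precision, recall, and F-measure, and treats recognition rate as overall accuracy (the fraction of utterances labeled correctly); the paper may define these differently. The function name `emotion_metrics` and the toy labels are hypothetical and are not taken from the Berlin or Danish databases.

```python
from collections import Counter


def emotion_metrics(true_labels, predicted_labels):
    """Return ({label: (precision, recall, f_measure)}, recognition_rate).

    Assumption: recognition rate is taken to mean overall accuracy.
    """
    labels = sorted(set(true_labels) | set(predicted_labels))
    # Count how often each (true, predicted) pair occurs.
    pair_counts = Counter(zip(true_labels, predicted_labels))

    per_class = {}
    for label in labels:
        true_positive = pair_counts[(label, label)]
        predicted_total = sum(c for (_, p), c in pair_counts.items() if p == label)
        actual_total = sum(c for (t, _), c in pair_counts.items() if t == label)

        precision = true_positive / predicted_total if predicted_total else 0.0
        recall = true_positive / actual_total if actual_total else 0.0
        denominator = precision + recall
        f_measure = 2 * precision * recall / denominator if denominator else 0.0
        per_class[label] = (precision, recall, f_measure)

    correct = sum(pair_counts[(label, label)] for label in labels)
    recognition_rate = correct / len(true_labels) if true_labels else 0.0
    return per_class, recognition_rate


# Toy data for illustration only (hypothetical labels, not real results).
true_emotions = ["anger", "anger", "joy", "joy", "neutral", "neutral"]
predicted_emotions = ["anger", "joy", "joy", "joy", "neutral", "anger"]
per_class, recognition_rate = emotion_metrics(true_emotions, predicted_emotions)
```

Reporting the per-class values alongside the overall rate is useful because a single recognition rate can hide emotions that a feature set confuses with one another.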
