Speech emotion recognition based on an improved brain emotion learning model

Abstract Human-robot emotional interaction has developed rapidly in recent years, and speech emotion recognition plays a significant role in it. This paper proposes a speech emotion recognition method based on an improved brain emotional learning (BEL) model, which is inspired by the emotional processing mechanism of the limbic system in the brain. The reinforcement learning rule of the BEL model, however, gives it poor adaptability and degrades its performance. To address this, a Genetic Algorithm (GA) is employed to update the weights of the BEL model. The proposal is tested on the CASIA Chinese emotion corpus, the SAVEE emotion corpus, and the FAU Aibo dataset, from which MFCC-related features and their first-order delta coefficients are extracted. In addition, the proposal is tested on the INTERSPEECH 2009 standard feature set, whose dimensionality is reduced using three methods: Linear Discriminant Analysis (LDA), Principal Component Analysis (PCA), and PCA+LDA. The experimental results show that the proposed method obtains average recognition accuracies of 90.28% (CASIA), 76.40% (SAVEE), and 71.05% (FAU Aibo) for speaker-dependent (SD) speech emotion recognition, and highest average accuracies of 38.55% (CASIA), 44.18% (SAVEE), and 64.60% (FAU Aibo) for speaker-independent (SI) speech emotion recognition, which indicates that the proposal is feasible for speech emotion recognition.
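As a rough illustration of the idea of letting a GA, rather than a reinforcement rule, update BEL weights, the sketch below evolves the weights of a heavily simplified BEL unit. It is not the authors' implementation: the output rule (amygdala sum minus orbitofrontal sum), the truncation selection, the one-point crossover, the Gaussian mutation, all parameter values, and the toy data are assumptions chosen for brevity, and every function name is hypothetical.

```python
import random


def bel_output(weights, stimulus):
    """Simplified BEL output (assumed form): amygdala sum minus orbitofrontal sum."""
    n = len(stimulus)
    amygdala_w, orbitofrontal_w = weights[:n], weights[n:]
    amygdala = sum(v * s for v, s in zip(amygdala_w, stimulus))
    orbitofrontal = sum(w * s for w, s in zip(orbitofrontal_w, stimulus))
    return amygdala - orbitofrontal


def mean_squared_error(weights, samples):
    """GA fitness (lower is better): mean squared error over (stimulus, target) pairs."""
    return sum((bel_output(weights, s) - t) ** 2 for s, t in samples) / len(samples)


def evolve_weights(samples, n_inputs, pop_size=30, generations=60,
                   mutation_scale=0.1, seed=0):
    """Search for BEL weights with a basic elitist GA (illustrative settings)."""
    rng = random.Random(seed)
    n_genes = 2 * n_inputs  # one amygdala and one orbitofrontal weight per input
    population = [[rng.uniform(-1.0, 1.0) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    history = []  # best error at the start of each generation, then the final best
    for _ in range(generations):
        population.sort(key=lambda ind: mean_squared_error(ind, samples))
        history.append(mean_squared_error(population[0], samples))
        elite = population[: pop_size // 2]  # truncation selection; elite kept unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            parent_a, parent_b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_genes)  # one-point crossover
            child = parent_a[:cut] + parent_b[cut:]
            child = [g + rng.gauss(0.0, mutation_scale) for g in child]  # mutation
            children.append(child)
        population = elite + children
    population.sort(key=lambda ind: mean_squared_error(ind, samples))
    history.append(mean_squared_error(population[0], samples))
    return population[0], history


# Hypothetical toy data: two-dimensional stimuli with scalar targets.
toy_samples = [([1.0, 0.0], 1.0), ([0.0, 1.0], 0.0), ([1.0, 1.0], 1.0)]
best_weights, error_history = evolve_weights(toy_samples, n_inputs=2)
print("first-generation best error:", error_history[0])
print("final best error:", error_history[-1])
```

Because the better half of the population is carried over unchanged, the best error cannot increase from one generation to the next. A real system would feed MFCC-based feature vectors instead of the toy stimuli and would need a multi-class output stage, neither of which is shown here.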
