Deep Learning Based Affective Model for Speech Emotion Recognition

Given the practical value of emotion in applications, emotion recognition has attracted increasing attention over the last decades. This work focuses on feasible speech emotion recognition. We build two affective models based on deep learning methods (a stacked autoencoder network and a deep belief network) for automatic extraction of salient emotion features and classification of emotion states. The experiments use the well-known German Berlin Emotional Speech Database, and the recognition accuracy reaches 65% in the best case. In addition, we examine how different speakers and different emotion categories influence recognition accuracy.
