A Review on Emotion Recognition Algorithms using Speech Analysis

In recent years, there has been growing interest in speech emotion recognition (SER), in which a speaker's emotional state is inferred by analyzing input speech. SER can be treated as a pattern recognition task comprising feature extraction, a classifier, and a speech emotion database. The objective of this paper is to provide a comprehensive review of the literature on SER. Several audio features are available, including linear predictive coding coefficients (LPCC), Mel-frequency cepstral coefficients (MFCC), and Teager energy based features. For the classifier, many algorithms are available, including the hidden Markov model (HMM), Gaussian mixture model (GMM), vector quantization (VQ), artificial neural networks (ANN), and deep neural networks (DNN). We also review various speech emotion databases. Finally, recent related work on SER using DNN is discussed.
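
As an illustration of one of the features named above, the following is a minimal sketch of the discrete Teager energy operator, psi[n] = x[n]^2 - x[n-1] * x[n+1]. The function name `teager_energy` and the test signal are assumptions made for illustration; this is not the implementation used in any of the reviewed works.

```python
import math


def teager_energy(signal):
    """Discrete Teager energy operator (hypothetical helper name).

    Computes psi[n] = x[n]^2 - x[n-1] * x[n+1] for every interior sample,
    so the output is two samples shorter than the input.
    """
    return [
        signal[n] ** 2 - signal[n - 1] * signal[n + 1]
        for n in range(1, len(signal) - 1)
    ]


# Illustrative input: a pure sinusoid with amplitude A and
# digital frequency omega (radians per sample).
A, omega = 2.0, 0.3
x = [A * math.cos(omega * n) for n in range(50)]
psi = teager_energy(x)
```

For a pure sinusoid the operator output is constant and equal to A^2 * sin^2(omega), so it tracks amplitude and frequency jointly. Frame-level statistics of such an output could then be passed to any of the classifiers listed above.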
