Use of Different Features for Emotion Recognition Using MLP Network

Recognising human emotion is a major challenge in today's complicated political and criminal settings. This paper attempts to recognise two classes of speech emotion: high arousal (anger and surprise) and low arousal (sadness and boredom). Four feature types are used for emotion recognition with a multilayer perceptron (MLP): linear prediction coefficients (LPC), linear prediction cepstral coefficients (LPCC), Mel frequency cepstral coefficients (MFCC) and perceptual linear prediction (PLP). These features are extracted from the audio channel and used to train and test the network. Two hundred utterances were collected from ten subjects across the four emotion categories. Of these, 175 utterances were used for training and 25 for testing.
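As an illustration of the first two feature types, the following sketch computes LPC coefficients for one speech frame and converts them to LPCC. It assumes the autocorrelation method with the Levinson-Durbin recursion and the standard LPC-to-cepstrum recursion; the abstract does not state which algorithm, prediction order or number of cepstral coefficients was used, so the function names, the order of 10 and the synthetic frame are all hypothetical choices made for the example.

```python
import math


def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of a single frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values r.

    Returns (a, error) where a[k-1] is the predictor coefficient a_k in
    s[n] ~ sum_k a_k * s[n-k], and error is the residual energy.
    """
    if r[0] == 0.0:  # silent frame: nothing to predict
        return [0.0] * order, 0.0
    a = [0.0] * (order + 1)  # a[0] is unused so indices match a_k
    error = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error  # reflection coefficient for this stage
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        error *= (1.0 - k * k)
    return a[1:], error


def lpc_to_lpcc(a, n_ceps):
    """Convert predictor coefficients to cepstral coefficients c_1..c_n.

    Uses c_m = a_m + sum_{k=1}^{m-1} (k/m) * c_k * a_{m-k},
    with a_m taken as 0 for m greater than the prediction order.
    """
    p = len(a)
    c = [0.0] * (n_ceps + 1)  # c[0] is unused so indices match c_m
    for m in range(1, n_ceps + 1):
        acc = a[m - 1] if m <= p else 0.0
        for k in range(max(1, m - p), m):
            acc += (k / m) * c[k] * a[m - k - 1]
        c[m] = acc
    return c[1:]


if __name__ == "__main__":
    # Synthetic decaying sinusoid standing in for one speech frame
    # (assumption: real input would be a windowed frame of recorded speech).
    frame = [math.sin(0.3 * n) * (0.99 ** n) for n in range(240)]
    order = 10  # hypothetical prediction order
    r = autocorrelation(frame, order)
    lpc, residual = levinson_durbin(r, order)
    lpcc = lpc_to_lpcc(lpc, 12)
    print("LPC :", [round(x, 4) for x in lpc])
    print("LPCC:", [round(x, 4) for x in lpcc])
```

In a pipeline like the one described, per-frame vectors of this kind would be pooled or concatenated per utterance and passed to the MLP as input; how the paper aggregates frames is not stated here.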
