A novel Adaptive Fractional Deep Belief Networks for speaker emotion recognition

Abstract With the rapid development of human-computer interaction systems, emotion recognition has become a challenging task. Various devices such as smartphones and PCs are used to recognize human emotion from speech. However, emotion recognition is difficult for a human-computer interaction system because emotional expression varies from speaker to speaker. To address this problem, this paper proposes the Adaptive Fractional Deep Belief Network (AFDBN). First, spectral features are extracted from the input speech signal: tonal power ratio, spectral flux, pitch chroma, and MFCC. The extracted feature set is then fed to the network for classification. The AFDBN is newly designed by combining fractional theory with the deep belief network, and it is used to find the optimal weights with which emotion is recognized efficiently. Finally, the experimental results are evaluated, and the performance is analyzed using evaluation metrics and compared with that of existing systems. The proposed method attains 99.17% accuracy on the Berlin database and 97.74% on the Telugu database.
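As an illustration of the feature-extraction stage, the following is a minimal sketch of one of the named features, spectral flux, computed with NumPy. It is not the paper's implementation: the frame length, hop size, Hann window, and the normalization by the number of frequency bins are all assumptions chosen for the example, and the function names `frame_signal` and `spectral_flux` are hypothetical.

```python
import numpy as np


def frame_signal(x, frame_len=512, hop=256):
    """Split a 1-D signal into overlapping frames (assumed framing parameters)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]


def spectral_flux(x, frame_len=512, hop=256):
    """Spectral flux: change in magnitude spectrum between consecutive frames.

    Assumed definition: L2 norm of the frame-to-frame magnitude difference,
    divided by the number of frequency bins.
    """
    frames = frame_signal(x, frame_len, hop) * np.hanning(frame_len)
    mag = np.abs(np.fft.rfft(frames, axis=1))   # magnitude spectrum per frame
    diff = np.diff(mag, axis=0)                 # difference between neighbouring frames
    return np.sqrt((diff ** 2).sum(axis=1)) / mag.shape[1]


if __name__ == "__main__":
    n = np.arange(4096)
    steady = np.sin(2 * np.pi * 8 * n / 256)    # stationary tone
    print(spectral_flux(steady).max())          # close to zero for a stationary signal
```

A stationary tone yields flux values near zero, while a signal whose spectrum changes over time yields larger values. In a pipeline like the one described, such per-frame values would be stacked with the other features before being passed to the classifier.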
