Detection of distress in speech

A distress situation is often reflected in a person's own speech (e.g., as fear) or in the speech of those nearby (e.g., as anger). In many real-life settings it is desirable to detect distress through remote monitoring. This paper addresses such monitoring using a microphone: we propose a technique for speaker-independent detection of distress in speech. Several temporal and spectral acoustic features, including Mel-frequency cepstral coefficients (MFCC) and features based on the Teager energy operator (TEO), are investigated, and the most relevant ones for the task are selected with the ReliefF feature selection algorithm. The selected features are used to train a support vector machine (SVM) classifier that distinguishes speech produced in a distress situation from speech in which no distress is present. On the Berlin Database of Emotional Speech, the proposed technique achieves classification accuracies of 91% per utterance and 87.8% per time window.
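As an illustration of one of the features named above, the following is a minimal sketch of the Teager energy operator in its standard discrete form, psi[n] = x[n]^2 - x[n-1]*x[n+1], averaged over sliding time windows. It is not the paper's feature extraction code: the function names, the window length, and the hop size are assumptions chosen for the example, and the abstract does not say how TEO values are turned into classifier features.

```python
import math


def teager_energy(x):
    """Standard discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1]*x[n+1].

    Returns one value per interior sample, so the output is two samples shorter
    than the input (empty if the input has fewer than three samples).
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def windowed_mean_teo(x, win, hop):
    """Mean Teager energy per time window.

    `win` and `hop` are in samples; both are hypothetical parameters for this
    sketch, not values taken from the paper.
    """
    psi = teager_energy(x)
    return [sum(psi[s:s + win]) / win for s in range(0, len(psi) - win + 1, hop)]


# Sanity check on a pure tone: for x[n] = A*cos(w*n) the operator is constant
# and equals A^2 * sin(w)^2, so every window should yield the same value.
amp, omega = 0.5, 0.3
tone = [amp * math.cos(omega * n) for n in range(200)]
features = windowed_mean_teo(tone, win=50, hop=25)
expected = amp ** 2 * math.sin(omega) ** 2
```

For real speech the per-window values vary over time; vectors built from such windows are the kind of input that could be passed to a feature selector and then to a classifier, with a per-utterance decision derived from the per-window decisions.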
