Detection of time-pressure induced stress in speech via acoustic indicators

We use automatically extracted acoustic features to detect speech produced under stress, achieving 76.24% accuracy with a binary logistic regression classifier. Our data are task-oriented human-human dialogues in which a time limit is unexpectedly introduced partway through. Our analysis suggests that the approximate point at which the time limit is introduced can also be detected. We also examine the importance of normalizing the acoustic features by speaker, and the problem of detecting stress in previously unseen speakers.
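As an illustration of what normalizing acoustic features by speaker can look like, the following is a minimal sketch that z-scores each feature within each speaker. The abstract does not state which normalization scheme was used, so z-scoring is an assumption here, and the function name `normalize_by_speaker` and the example features (mean pitch in Hz, mean intensity in dB) are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def normalize_by_speaker(rows):
    """Z-score each feature within each speaker (assumed scheme, not from the paper).

    rows: list of (speaker_id, [feature values]) tuples.
    Returns a list of (speaker_id, [z-scored values]) in the same order.
    """
    # Group the feature vectors by speaker.
    by_speaker = defaultdict(list)
    for speaker, feats in rows:
        by_speaker[speaker].append(feats)

    # Compute the per-speaker mean and standard deviation of every feature.
    stats = {}
    for speaker, vectors in by_speaker.items():
        columns = list(zip(*vectors))
        means = [mean(c) for c in columns]
        # Fall back to 1.0 when a feature is constant for a speaker,
        # to avoid dividing by zero.
        stds = [pstdev(c) or 1.0 for c in columns]
        stats[speaker] = (means, stds)

    # Rescale every utterance relative to its own speaker's statistics.
    normalized = []
    for speaker, feats in rows:
        means, stds = stats[speaker]
        scaled = [(x - m) / s for x, m, s in zip(feats, means, stds)]
        normalized.append((speaker, scaled))
    return normalized


# Hypothetical utterances: (speaker, [mean pitch in Hz, mean intensity in dB]).
utterances = [
    ("a", [100.0, 60.0]),
    ("a", [120.0, 70.0]),
    ("b", [200.0, 60.0]),
    ("b", [240.0, 70.0]),
]
normalized = normalize_by_speaker(utterances)
```

After this step, a speaker with a naturally high pitch and a speaker with a naturally low pitch are placed on the same scale, so a classifier sees deviations from each speaker's own baseline rather than raw values. For a new speaker, the statistics would have to be estimated from whatever speech of that speaker is available, which is one reason detection in unseen speakers is a separate question.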
