Stress Detection from Audio on Multiple Window Analysis Size in a Public Speaking Task

Changes in speech production are one of the many indicators of stress in humans. A job interview simulation task allowed us to collect a multimodal corpus that includes physiological data. Physiological cues of stress are reliable over long periods but require invasive sensors. Variations in the human voice have been shown to be a non-invasive stress cue. In this paper, we focus on frame-wise stress detection over several analysis window sizes and analyze the behavior of different classes of audio features. We trained our system on 19 subjects and tested it on 10 other subjects. Our best system achieves an Unweighted Average Recall of 71.9% on 5 s windows.
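
For readers unfamiliar with the reported metric, Unweighted Average Recall is the mean of the per-class recalls, so each class counts equally regardless of how many windows it contains. The following is a minimal sketch of that standard definition, not the evaluation code used in this work; the function name and the "stress"/"calm" labels are illustrative assumptions.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls (standard UAR definition).

    Illustrative helper: the name and label values are assumptions,
    not taken from the paper.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one labeled window is required")

    hits = defaultdict(int)    # correctly predicted windows per true class
    totals = defaultdict(int)  # number of windows per true class
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        if truth == pred:
            hits[truth] += 1

    # Recall is computed per class, then averaged without class weighting.
    recalls = [hits[label] / totals[label] for label in totals]
    return sum(recalls) / len(recalls)


# Hypothetical imbalanced example: 2 stressed windows, 8 calm windows,
# and a classifier that always predicts "calm".
y_true = ["stress"] * 2 + ["calm"] * 8
y_pred = ["calm"] * 10
uar = unweighted_average_recall(y_true, y_pred)
```

In this hypothetical case plain accuracy would be 0.8, while the UAR is 0.5, which is why the metric is commonly preferred when the classes are imbalanced.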
