An audio-visual dataset of human–human interactions in stressful situations