Time-resolved discrimination of audio-visual emotion expressions

Humans seamlessly extract and integrate the emotional content conveyed by the faces and voices of others. It is, however, poorly understood how perceptual decisions unfold in time when people discriminate emotion expressions transmitted through dynamic facial and vocal signals, as in natural social contexts. In this study, we relied on a gating paradigm to track how the recognition of emotion expressions across the senses unfolds over exposure time. We first demonstrate that, across all emotions tested, a discriminatory decision is reached earlier with faces than with voices. Importantly, multisensory stimulation consistently reduced the accumulation of perceptual evidence required to reach correct discrimination (the isolation point). We also observed that expressions with different emotional content provide cumulative evidence at different speeds, with "fear" showing the fastest isolation point across the senses. Finally, the lack of correlation between the confusion patterns in response to facial and vocal signals across time suggests distinct relations between the discriminative features extracted from the two signals. Altogether, these results provide a comprehensive view of how auditory, visual, and audiovisual information related to different emotion expressions accumulates over time, highlighting how a multisensory context can speed up the discrimination process when minimal information is available.
