Processing communicative facial and vocal cues in the superior temporal sulcus

Facial and vocal cues provide critical social information about other humans, including their emotional and attentional states and the content of their speech. Recent work has shown that the face-responsive region of the posterior superior temporal sulcus ("fSTS") also responds strongly to vocal sounds. Here, we investigate the functional role of this region and of the broader STS by measuring fMRI responses to a range of face movements, vocal sounds, and hand movements. We find that the fSTS responds broadly to different types of auditory and visual face actions, including both richly social communicative actions and minimally social noncommunicative actions, ruling out hypotheses of specialization for processing speech signals, or for communicative signals more generally. Strikingly, however, responses to hand movements were very low, whether communicative or not, indicating a specific role in the analysis of face actions (facial and vocal) rather than a general role in the perception of human action. Furthermore, spatial patterns of response in this region discriminated communicative from noncommunicative face actions, both within and across modality (facial/vocal cues), indicating sensitivity to an abstract social dimension. These functional properties of the fSTS contrast with those of a middle STS region, which shows a selective, largely unimodal auditory response to speech sounds over both communicative and noncommunicative nonspeech vocal sounds as well as nonvocal sounds. Region-of-interest analyses were corroborated by a data-driven independent component analysis, which identified face-voice and auditory speech responses as dominant sources of voxelwise variance across the STS. These results suggest that the STS contains separate processing streams for the audiovisual analysis of face actions and for auditory speech processing.
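
To make the two analysis approaches mentioned above concrete, a minimal sketch in Python (scikit-learn) is given below. It is illustrative only and not the pipeline used here: the arrays, labels, dimensions, and effect sizes (X_visual, X_auditory, condition_by_voxel, the 0.5 scaling) are hypothetical placeholders. The first part shows cross-modal multivariate decoding (train a classifier on facial-cue response patterns, test on vocal-cue patterns, and vice versa); the second shows a data-driven ICA decomposition of a condition-by-voxel response matrix.

```python
# Illustrative sketch only (synthetic data, hypothetical dimensions); not the
# analysis pipeline reported in this study.
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)

# Hypothetical data: trial-by-voxel response patterns in an fSTS ROI.
# Labels mark communicative (1) vs. noncommunicative (0) face actions,
# measured separately for visual (facial) and auditory (vocal) stimuli.
n_trials, n_voxels = 40, 200
y = np.repeat([0, 1], n_trials // 2)
shared_pattern = rng.normal(size=n_voxels)          # modality-general signal
X_visual = rng.normal(size=(n_trials, n_voxels)) + 0.5 * y[:, None] * shared_pattern
X_auditory = rng.normal(size=(n_trials, n_voxels)) + 0.5 * y[:, None] * shared_pattern

# Cross-modal decoding: train a linear classifier on facial-cue patterns and
# test on vocal-cue patterns (and vice versa). Above-chance accuracy would
# indicate a modality-general communicative/noncommunicative code.
clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
clf.fit(X_visual, y)
acc_vis_to_aud = clf.score(X_auditory, y)
clf.fit(X_auditory, y)
acc_aud_to_vis = clf.score(X_visual, y)
print(f"cross-modal accuracy: {acc_vis_to_aud:.2f} (vis->aud), "
      f"{acc_aud_to_vis:.2f} (aud->vis)")

# Data-driven check: ICA on a condition-by-voxel matrix spanning the STS,
# recovering a few components whose condition response profiles can then be
# inspected (e.g., a face-voice-like vs. an auditory-speech-like profile).
n_conditions, sts_voxels = 12, 1000
condition_by_voxel = rng.normal(size=(n_conditions, sts_voxels))
ica = FastICA(n_components=4, random_state=0, max_iter=1000)
profiles = ica.fit_transform(condition_by_voxel)  # conditions x components
weights = ica.mixing_                             # voxels x components
print("ICA component profiles:", profiles.shape, "voxel weights:", weights.shape)
```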
