Auditory and Semantic Cues Facilitate Decoding of Visual Object Category in MEG.

Sounds (e.g., barking) help us to visually identify objects (e.g., a dog) that are distant or ambiguous. While neuroimaging studies have revealed neuroanatomical sites of audiovisual interactions, little is known about the time course by which sounds facilitate visual object processing. Here we used magnetoencephalography to reveal the time course of the facilitatory influence of natural sounds (e.g., barking) on visual object processing and compared this to the facilitatory influence of spoken words (e.g., "dog"). Participants viewed images of blurred objects preceded by a task-irrelevant natural sound, a spoken word, or uninformative noise. A classifier was trained to discriminate the multivariate sensor patterns evoked by intact animate and inanimate objects presented without sounds in a separate experiment, and was then tested on the sensor patterns evoked by the blurred objects in the three auditory conditions. Results revealed that both sounds and words, relative to uninformative noise, significantly facilitated visual object category decoding between 300 and 500 ms after visual onset. We found no evidence for earlier facilitation by sounds than by words. These findings provide evidence for a semantic route of facilitation by both natural sounds and spoken words, whereby the auditory input first activates semantic object representations, which then modulate the visual processing of objects.
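
To make the decoding scheme concrete, the sketch below illustrates time-resolved cross-decoding of the kind described above: a classifier is trained at each time point on sensor patterns from one stimulus set (intact objects, no sounds) and tested on patterns from another (blurred objects in each auditory condition). This is a minimal illustration, not the authors' pipeline: the use of Python with scikit-learn, the linear discriminant classifier, and all variable names and array shapes are assumptions made for the example.

    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    def cross_decode(train_X, train_y, test_X, test_y):
        # Time-resolved cross-decoding: train a classifier at each time
        # point on one set of MEG sensor patterns, test it on another.
        #   train_X: (n_train_trials, n_sensors, n_times) intact-object epochs
        #   train_y: (n_train_trials,) labels, e.g. 0 = inanimate, 1 = animate
        #   test_X:  (n_test_trials, n_sensors, n_times) blurred-object epochs
        #   test_y:  (n_test_trials,) labels for the test epochs
        # Returns an (n_times,) array of decoding accuracies.
        n_times = train_X.shape[2]
        accuracy = np.empty(n_times)
        for t in range(n_times):
            clf = make_pipeline(StandardScaler(), LinearDiscriminantAnalysis())
            clf.fit(train_X[:, :, t], train_y)                # train on intact objects
            accuracy[t] = clf.score(test_X[:, :, t], test_y)  # test on blurred objects
        return accuracy

    # Hypothetical usage: the facilitation effect is the difference in
    # decoding accuracy between an informative condition and noise.
    # acc_sound = cross_decode(intact_X, intact_y, blurred_sound_X, blurred_y)
    # acc_noise = cross_decode(intact_X, intact_y, blurred_noise_X, blurred_y)
    # facilitation = acc_sound - acc_noise  # expected > 0 around 300-500 ms

Training on intact, sound-free objects and testing on blurred objects ensures that above-chance decoding in the auditory conditions reflects a category representation that generalizes across stimulus sets, rather than condition-specific stimulus features.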
