Decoding natural scenes based on sounds of objects within scenes using multivariate pattern analysis

Scene recognition plays an important role in spatial navigation and scene classification. It remains unknown whether the occipitotemporal cortex represents the semantic association between scenes and the sounds of objects within those scenes. In this study, we used functional magnetic resonance imaging (fMRI) and multivariate pattern analysis to assess whether different scenes could be discriminated based on the patterns evoked by sounds of objects within the scenes. We found that patterns evoked by scenes could be predicted from patterns evoked by sounds of objects within the scenes in the posterior fusiform area (pF), lateral occipital area (LO) and superior temporal sulcus (STS). Further functional connectivity analysis suggested significant correlations among pF, LO and the parahippocampal place area (PPA), but not between STS and the other three regions, under both the scene and sound conditions. A distinct network for processing scenes and sounds was identified using a seed-to-voxel analysis with STS as the seed. This study may provide a cross-modal channel of scene decoding through the sounds of objects within the scenes in the occipitotemporal cortex, which could complement the single-modal channel of scene decoding based on global scene properties or objects within the scenes.
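The cross-modal decoding logic described above (learn category patterns from one modality, test on the other) can be illustrated with a minimal sketch. This is not the study's pipeline: it substitutes a correlation-based nearest-centroid classifier for whatever classifier the authors used, and it runs on synthetic "voxel" patterns. The category names, the `shared` signal strength, and all function names are hypothetical and chosen only for illustration.

```python
import random
from statistics import mean


def pearson(a, b):
    """Pearson correlation between two equal-length pattern vectors."""
    ma, mb = mean(a), mean(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = (sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b)) ** 0.5
    return num / den if den else 0.0


def class_centroids(patterns, labels):
    """Average the training patterns of each category, voxel by voxel."""
    centroids = {}
    for lab in sorted(set(labels)):
        rows = [p for p, l in zip(patterns, labels) if l == lab]
        centroids[lab] = [mean(col) for col in zip(*rows)]
    return centroids


def cross_modal_accuracy(train_x, train_y, test_x, test_y):
    """Train on one modality, test on the other; return the proportion correct."""
    centroids = class_centroids(train_x, train_y)
    hits = 0
    for pattern, label in zip(test_x, test_y):
        # Predict the category whose training centroid correlates best.
        pred = max(centroids, key=lambda lab: pearson(pattern, centroids[lab]))
        hits += pred == label
    return hits / len(test_y)


def simulate_modality(templates, n_trials, shared, rng):
    """Synthetic trials: shared category signal plus independent Gaussian noise.

    `shared` scales how much of the category template is present; it is an
    assumption of this toy simulation, not a quantity from the study.
    """
    patterns, labels = [], []
    for lab, template in templates.items():
        for _ in range(n_trials):
            patterns.append([shared * t + rng.gauss(0, 1) for t in template])
            labels.append(lab)
    return patterns, labels


def run_demo(seed=0, n_voxels=50, n_trials=20, shared=1.0):
    rng = random.Random(seed)
    # Hypothetical scene categories, each with a random voxel template.
    templates = {c: [rng.gauss(0, 1) for _ in range(n_voxels)]
                 for c in ("beach", "street")}
    sound_x, sound_y = simulate_modality(templates, n_trials, shared, rng)
    scene_x, scene_y = simulate_modality(templates, n_trials, shared, rng)
    # Train on sound-evoked patterns, test on scene-evoked patterns.
    return cross_modal_accuracy(sound_x, sound_y, scene_x, scene_y)


if __name__ == "__main__":
    print("cross-modal accuracy:", run_demo())
```

In this toy setting, above-chance accuracy arises only when the two modalities share a category-specific pattern (`shared > 0`), which is the inference that cross-modal classification is meant to support. In real data, significance would be assessed against a permutation-based null distribution rather than read directly from the accuracy.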
