Human‐Object Interactions Are More than the Sum of Their Parts

Abstract: Understanding human‐object interactions is critical for extracting meaning from everyday visual scenes and requires integrating complex relationships between human pose and object identity into a new percept. To understand how the brain builds these representations, we conducted two fMRI experiments in which subjects viewed humans interacting with objects, noninteracting human‐object pairs, and isolated humans and objects. A number of visual regions process features of human‐object interactions, including object identity information in the lateral occipital complex (LOC) and parahippocampal place area (PPA), and human pose information in the extrastriate body area (EBA) and posterior superior temporal sulcus (pSTS). Representations of human‐object interactions in some regions, such as the posterior PPA (retinotopic maps PHC1 and PHC2), are well predicted by a simple linear combination of the responses to object and pose information. Other regions, however, especially pSTS, exhibit representations of human‐object interaction categories that are not predicted by their individual components, indicating that they encode human‐object interactions as more than the sum of their parts. These results reveal the distributed networks underlying the emergent representation of human‐object interactions necessary for social perception.
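
The "simple linear combination" test mentioned in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the analysis used in the study: the voxel patterns are synthetic, the helper name `fit_linear_combination` is hypothetical, and the fit is evaluated in-sample for brevity. It shows how an interaction pattern built from its parts is well predicted by weighted object and pose patterns, whereas a pattern unrelated to its parts is not.

```python
import numpy as np

def fit_linear_combination(object_pattern, pose_pattern, interaction_pattern):
    """Hypothetical helper: least-squares weights (w_obj, w_pose) such that
    w_obj * object_pattern + w_pose * pose_pattern approximates
    interaction_pattern, plus the correlation between fit and data."""
    X = np.column_stack([object_pattern, pose_pattern])
    weights, *_ = np.linalg.lstsq(X, interaction_pattern, rcond=None)
    predicted = X @ weights
    r = np.corrcoef(predicted, interaction_pattern)[0, 1]
    return weights, r

# Synthetic voxel patterns (assumed values, not real fMRI data).
rng = np.random.default_rng(0)
n_voxels = 200
obj = rng.standard_normal(n_voxels)    # response to the isolated object
pose = rng.standard_normal(n_voxels)   # response to the isolated human pose

# Case 1: interaction response is a weighted sum of its parts plus small noise.
additive = 0.6 * obj + 0.4 * pose + 0.05 * rng.standard_normal(n_voxels)

# Case 2: interaction response is unrelated to the part responses.
emergent = rng.standard_normal(n_voxels)

w_add, r_add = fit_linear_combination(obj, pose, additive)
w_em, r_em = fit_linear_combination(obj, pose, emergent)

print("additive region: weights", np.round(w_add, 2), "fit r =", round(r_add, 3))
print("emergent region: weights", np.round(w_em, 2), "fit r =", round(r_em, 3))
```

In a real analysis the weights would be estimated on one part of the data and the prediction evaluated on held-out data, so that a high correlation cannot arise from overfitting.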
