Modelling Visual Properties and Visual Context in Multimodal Semantics

Multimodal semantic models that extend linguistic representations with additional perceptual input have proved successful in a range of natural language processing (NLP) tasks. However, existing research has extracted visual features from complete images and has not examined how different kinds of visual information affect performance. We construct multimodal models that distinguish between the internal visual properties of objects and their external visual context. We evaluate the models on the task of decoding brain activity associated with the meanings of nouns, demonstrating their advantage over models based on complete images.
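The abstract describes two standard ingredients of this line of work: fusing a textual representation with a visual one (here, either internal object features or external context features), and evaluating the resulting vectors by decoding brain activity for noun stimuli. The sketch below is illustrative only, assuming the common normalize-and-concatenate fusion and the leave-two-out 2-vs-2 matching test introduced by Mitchell et al. (2008); all function names are hypothetical, not taken from the paper.

```python
import numpy as np

def l2_normalize(v, eps=1e-9):
    # Unit-normalize so neither modality dominates by raw magnitude.
    return v / (np.linalg.norm(v) + eps)

def cos(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def multimodal_vector(text_vec, visual_vec):
    """Middle fusion (illustrative): normalize each modality, then
    concatenate. visual_vec could come from the object region alone
    (internal properties) or from the surrounding scene (context)."""
    return np.concatenate([l2_normalize(text_vec), l2_normalize(visual_vec)])

def pairwise_accuracy(pred, gold):
    """Leave-two-out 2-vs-2 test: for each pair of concepts, the
    prediction is correct when matching each predicted vector to its
    own gold vector scores higher than the swapped assignment."""
    n = len(pred)
    correct, total = 0, 0
    for i in range(n):
        for j in range(i + 1, n):
            match = cos(pred[i], gold[i]) + cos(pred[j], gold[j])
            swap = cos(pred[i], gold[j]) + cos(pred[j], gold[i])
            correct += match > swap
            total += 1
    return correct / total
```

A decoder (e.g. ridge regression from fMRI voxels to the multimodal vectors) would be trained with two concepts held out, its predictions scored with `pairwise_accuracy`, and the procedure repeated over all pairs.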
