Visual Information in Semantic Representation

How meaning might be acquired by young children and represented by adult speakers of a language is one of the most debated questions in cognitive science. Existing models of semantic representation are primarily amodal: they rely on information provided by the linguistic input, despite ample evidence that the cognitive system is also sensitive to perceptual information. In this work we exploit the vast resource of images and associated documents available on the web to develop a model of multimodal meaning representation based on both linguistic and visual context. Experimental results show that taking the visual modality into account yields a closer correspondence to human data.
