A vision-grounded dataset for predicting typical locations for verbs

Information about the location of an action is often implicit in text, because humans can infer it from common-sense knowledge. Today’s NLP systems, however, struggle to infer information that goes beyond what is explicit in text. Selectional preference estimation based on large amounts of data provides a way to infer prototypical role fillers, but text-based systems tend to underestimate the probability of the most typical role fillers. Here we present a new dataset containing thematic fit judgments for 2,000 verb/location pairs. This dataset can be used to evaluate text-based, vision-based, or multimodal systems that infer how typical a location is for an event. We additionally provide three thematic fit baselines for this dataset: a state-of-the-art neural-network-based thematic fit model learned from linguistic data, a model that estimates typical locations based on the MSCOCO dataset, and a simple combination of the two.
