Figure Me Out: A Gold Standard Dataset for Metaphor Interpretation

Metaphor comprehension is a complex cognitive task that requires grasping the interaction between the meanings of a metaphor's target and source concepts. This is challenging for humans, let alone computers. Automatic metaphor interpretation remains understudied, in part because of the lack of publicly available datasets. Creating and manually annotating such datasets is demanding and requires considerable cognitive effort and time. Moreover, the subjective nature of the problem always raises questions about the accuracy and consistency of the annotated data. This work addresses these issues by presenting an annotation scheme for interpreting verb-noun metaphoric expressions in text. The scheme is designed to reduce the workload on annotators and to maintain consistency. Our methodology employs an automatic retrieval approach that uses external lexical resources, word embeddings and semantic similarity to generate possible interpretations of identified metaphors, enabling quick and accurate annotation. We validate the approach on around 1,500 metaphors in tweets, annotated by six native English speakers. As a result of this work, we publish as linked data the first gold standard dataset for metaphor interpretation, which will facilitate research in this area.
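The candidate-generation step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy three-dimensional vectors and the candidate list are all hypothetical, and the ranking simply assumes that candidate verbs drawn from a lexical resource are scored by cosine similarity to the noun of the verb-noun expression (here, "grasp an idea").

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_candidates(noun, candidates, embeddings):
    """Rank candidate verbs by similarity to the noun, most similar first.

    Candidates missing from the embedding table are skipped.
    """
    noun_vec = embeddings[noun]
    scored = [
        (verb, cosine(embeddings[verb], noun_vec))
        for verb in candidates
        if verb in embeddings
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors invented for illustration; a real setup would load
# pretrained word embeddings instead.
TOY_EMBEDDINGS = {
    "idea": [0.9, 0.1, 0.0],
    "understand": [0.8, 0.2, 0.1],
    "hold": [0.1, 0.9, 0.2],
    "seize": [0.0, 0.8, 0.5],
}

# Hypothetical candidates for the metaphoric verb "grasp", as might be
# retrieved from a lexical resource (synonyms, definitions, and so on).
ranked = rank_candidates("idea", ["hold", "understand", "seize"], TOY_EMBEDDINGS)

for verb, score in ranked:
    print(f"{verb}: {score:.3f}")
```

With these toy vectors the literal reading "understand" ranks above "hold" and "seize". In an annotation setting, a ranked list of this kind would be shown to annotators as suggestions rather than treated as the final interpretation.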
