A Large Harvested Corpus of Location Metonymy

Metonymy is a figure of speech in which an entity is referred to by another, related entity. Existing metonymy datasets are either too small or lack sufficient coverage. We propose WiMCor, a new, labelled, high-quality corpus of location metonymy that is both large and high in coverage. The corpus is harvested semi-automatically from English Wikipedia and annotated with labels of varying granularity. It can be used directly for training and evaluating automatic metonymy resolution systems. We construct benchmarks for metonymy resolution and evaluate baseline methods on the new corpus.
