On metonymy recognition for geographic information retrieval

Metonymically used location names (toponyms) refer to other, related entities and thus carry a meaning different from their literal, geographic sense. Metonymic uses should therefore be treated differently from literal ones to improve the performance of geographic information retrieval (GIR). Statistics on toponym senses show that 75.06% of all location names are used in their literal sense, 17.05% are used metonymically, and 7.89% have a mixed sense. This article presents a method, based on shallow features, for disambiguating location names in texts between their literal and metonymic senses. The method is evaluated in two ways. First, we use a memory-based learner (TiMBL) to train a classifier and determine standard evaluation measures such as F-score and accuracy. The classifier achieves an F-score of 0.842 and an accuracy of 0.846 for identifying toponym senses in a subset of the CoNLL (Conference on Natural Language Learning) data. Second, we perform retrieval experiments on the GeoCLEF data (newspaper article corpus and queries) from 2005 and 2006. We compare searching for location names in a database index containing both their literal and metonymic senses with searching in an index containing their literal senses only. The evaluation results indicate that removing metonymic senses from the index yields a higher mean average precision (MAP) for GIR. Overall, we observed a significant gain in MAP: an increase from 0.0704 to 0.0715 for the GeoCLEF 2005 data, and from 0.1944 to 0.2100 for the GeoCLEF 2006 data.
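For readers unfamiliar with the retrieval measure reported above, the following is a minimal sketch of how mean average precision is conventionally computed over a set of queries. It is an illustration only, not the evaluation tool used in the experiments; the function names and the toy document identifiers are assumptions introduced here.

```python
from typing import Dict, List, Set


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average precision for one query (hypothetical helper).

    `ranked` is the result list in rank order; `relevant` is the set of
    document IDs judged relevant for the query.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            # Precision at the rank where a relevant document is found.
            precision_sum += hits / rank
    # Relevant documents that are never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(
    runs: Dict[str, List[str]], qrels: Dict[str, Set[str]]
) -> float:
    """Mean of the per-query average precision over all judged queries."""
    if not qrels:
        return 0.0
    total = sum(
        average_precision(runs.get(query_id, []), relevant)
        for query_id, relevant in qrels.items()
    )
    return total / len(qrels)


# Toy data (invented for illustration, not taken from GeoCLEF).
toy_runs = {"q1": ["d1", "d2", "d3"], "q2": ["d4", "d5"]}
toy_qrels = {"q1": {"d1", "d3"}, "q2": {"d5"}}
toy_map = mean_average_precision(toy_runs, toy_qrels)
```

Under this definition, comparing two indexes amounts to computing the measure once per index over the same queries and relevance judgments and comparing the two values.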
