A Mixed Approach in Recognising Geographical Entities in Texts

The paper describes an approach for the automatic identification of named entities belonging to the geographical domain in Romanian texts. The research is part of a project (MappingBooks) that aims to link mentions of entities in an e-book with external information, as found in social media, Wikipedia, or web pages containing cultural or touristic information, in order to enhance the reader’s experience. The described named entity recognizer combines ontological information, as found in public resources, with handwritten symbolic rules. The outputs of the two component modules are compared, and heuristics are used to resolve cases of conflict.
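
The two-module design summarized above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the tiny gazetteer, the trigger words, the label names, and in particular the conflict heuristic (a contextual rule overrides a gazetteer label for the same span). The actual resources and heuristics of the described system are not specified here.

```python
import re
from typing import Dict, List, Tuple

Span = Tuple[int, int, str]  # (start offset, end offset, label)

# Hypothetical mini-gazetteer standing in for ontological resources.
GAZETTEER: Dict[str, str] = {
    "Iași": "CITY",
    "Bistrița": "CITY",   # also the name of a river, hence ambiguous
    "Prut": "RIVER",
}

# Hypothetical trigger words for the handwritten symbolic rules.
# Simplification: only lowercase trigger forms are matched.
TRIGGERS: Dict[str, str] = {
    "orașul": "CITY",      # "the city"
    "râul": "RIVER",       # "the river"
    "muntele": "MOUNTAIN", # "the mountain"
}


def gazetteer_spans(text: str) -> List[Span]:
    """Module 1: label every whole-word occurrence of a gazetteer name."""
    spans = []
    for name, label in GAZETTEER.items():
        for m in re.finditer(r"\b" + re.escape(name) + r"\b", text):
            spans.append((m.start(), m.end(), label))
    return spans


def rule_spans(text: str) -> List[Span]:
    """Module 2: a trigger word followed by a capitalized token."""
    triggers = "|".join(re.escape(t) for t in TRIGGERS)
    pattern = r"\b(" + triggers + r")\s+([A-ZĂÂÎȘȚ]\w+)"
    return [(m.start(2), m.end(2), TRIGGERS[m.group(1)])
            for m in re.finditer(pattern, text)]


def recognize(text: str) -> List[Tuple[str, str]]:
    """Compare both outputs; on conflict the contextual rule wins
    (an assumed heuristic, chosen only for this sketch)."""
    merged: Dict[Tuple[int, int], str] = {}
    for start, end, label in gazetteer_spans(text):
        merged[(start, end)] = label
    for start, end, label in rule_spans(text):
        merged[(start, end)] = label  # overrides a gazetteer label
    return [(text[s:e], merged[(s, e)]) for (s, e) in sorted(merged)]


if __name__ == "__main__":
    sentence = "Am vizitat orașul Iași și am traversat râul Bistrița spre Prut."
    for surface, label in recognize(sentence):
        print(surface, label)
```

In this sketch, "Bistrița" is labeled CITY by the gazetteer lookup but RIVER by the rule triggered by "râul"; the two modules disagree on the same span, and the assumed heuristic keeps the rule-based label, while "Prut" is found by the gazetteer alone.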
