Paper information - Fine-grained Dutch named entity recognition

Fine-grained Dutch named entity recognition

This paper describes the creation of a fine-grained named entity annotation scheme and corpus for Dutch, and experiments on the automatic recognition of named entity main types and subtypes. We give an overview of existing named entity annotation schemes and motivate our own, which distinguishes six main types (persons, organizations, locations, products, events and miscellaneous named entities) and adds finer-grained information on subtypes and metonymic usage. The scheme was applied to a one-million-word subset of the Dutch SoNaR reference corpus. The classifier for main type named entities achieves a micro-averaged F-score of 84.91%, and is publicly available, along with the corpus and annotations.
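As a rough illustration of the metric named in the abstract, the sketch below computes a micro-averaged F-score by pooling true positives, false positives and false negatives over all main types before computing precision and recall. The type labels and the counts are hypothetical placeholders, not the paper's tag set or results, and the function name `micro_f1` is an assumption made for this example.

```python
# Minimal sketch of a micro-averaged F-score over entity main types.
# Labels and counts below are hypothetical, not taken from the paper.

def micro_f1(counts):
    """counts maps an entity type to a (tp, fp, fn) tuple."""
    # Pool the counts across all types first (this is what makes it "micro").
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of pooled precision and pooled recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical per-type counts for the six main types.
example_counts = {
    "PER": (90, 10, 8),
    "ORG": (70, 15, 20),
    "LOC": (85, 5, 10),
    "PRO": (20, 10, 12),
    "EVE": (10, 5, 6),
    "MISC": (15, 10, 9),
}

score = micro_f1(example_counts)
print(f"micro-averaged F1: {score:.4f}")
```

Because the counts are pooled, frequent types such as persons or locations weigh more heavily in the result than rare types such as events; a macro-average would instead give every type equal weight.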

Véronique Hoste | Bart Desmet
