Modular Classifier Ensemble Architecture for Named Entity Recognition on Low Resource Systems

This paper presents the best-performing Named Entity Recognition system in the GermEval 2014 Shared Task. Our approach combines semi-automatically created lexical resources with an ensemble of binary classifiers that extracts the most likely tag sequence. Out-of-vocabulary words are handled with semantic generalization extracted from a large corpus and with an ensemble of part-of-speech taggers, one of which is unsupervised. Unknown candidate sequences are resolved by a lookup via the Wikipedia API.
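As a rough illustration of how an ensemble of binary classifiers might be combined into a tag sequence, here is a minimal Python sketch. It is not the authors' implementation: the per-token argmax with a threshold is a simplification of sequence-level decoding, and every name in it (`tag_sequence`, `in_lexicon`, the `threshold` parameter, the toy lexicons) is a hypothetical introduced for this example.

```python
from typing import Callable, Dict, List, Set

# Assumption: each binary classifier scores one tag for the token at a
# given position and returns a value in [0, 1].
BinaryClassifier = Callable[[List[str], int], float]


def tag_sequence(
    tokens: List[str],
    classifiers: Dict[str, BinaryClassifier],
    threshold: float = 0.5,
) -> List[str]:
    """Pick, per token, the tag whose binary classifier is most confident.

    Tokens for which no classifier reaches `threshold` get the outside
    tag "O". This is a simplified per-token decision, not full
    sequence decoding.
    """
    tags = []
    for i in range(len(tokens)):
        scores = {tag: clf(tokens, i) for tag, clf in classifiers.items()}
        best_tag = max(scores, key=scores.get)
        tags.append(best_tag if scores[best_tag] >= threshold else "O")
    return tags


def in_lexicon(lexicon: Set[str]) -> BinaryClassifier:
    """Toy stand-in for a trained classifier: a plain lexicon lookup."""
    def score(tokens: List[str], i: int) -> float:
        return 1.0 if tokens[i] in lexicon else 0.0
    return score


# Hypothetical toy lexicons standing in for the lexical resources.
classifiers = {
    "PER": in_lexicon({"Merkel"}),
    "LOC": in_lexicon({"Berlin"}),
}

example_tags = tag_sequence(["Merkel", "besucht", "Berlin"], classifiers)
print(example_tags)  # ['PER', 'O', 'LOC']
```

In a real system each lexicon lookup would be replaced by a trained classifier over richer features, and the per-token choice by a decoder that scores whole tag sequences.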

Christian Hänig | Stefan Bordag
