Training and Evaluating a German Named Entity Recognizer with Semantic Generalization

We present a freely available, optimized Named Entity Recognizer (NER) for German. It compensates for the small size of the available German NER training corpora by adding distributional generalization features trained on large unlabelled corpora. We vary the size and source of the generalization corpus and find improvements of 6% F1 (in-domain) and 9% F1 (out-of-domain) over simple supervised training.
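
To make the idea of distributional generalization features concrete, the following minimal sketch shows one way such features could be attached to tokens before supervised training. It assumes the generalization takes the form of word clusters induced from unlabelled text; the `CLUSTERS` map, the cluster IDs, and the `token_features` function are hypothetical names chosen for illustration and are not taken from the system described here.

```python
from typing import Dict, List

# Hypothetical word-to-cluster map. In practice it would be induced from a
# large unlabelled corpus; the entries and IDs here are made up.
CLUSTERS: Dict[str, str] = {
    "berlin": "c17",
    "münchen": "c17",
    "merkel": "c42",
}


def token_features(tokens: List[str], i: int, clusters: Dict[str, str]) -> Dict[str, str]:
    """Build a feature dict for tokens[i], including cluster-based features."""
    word = tokens[i]
    feats = {
        # Ordinary surface features available from the labelled data alone.
        "word": word.lower(),
        "is_capitalized": str(word[:1].isupper()),
        "suffix3": word[-3:].lower(),
    }
    # Generalization feature: words unseen in the labelled data can still
    # share a cluster ID with words that were seen.
    feats["cluster"] = clusters.get(word.lower(), "UNK")
    if i > 0:
        feats["prev_cluster"] = clusters.get(tokens[i - 1].lower(), "UNK")
    return feats


sentence = ["Merkel", "besucht", "München"]
features = [token_features(sentence, i, CLUSTERS) for i in range(len(sentence))]
```

In this sketch, "München" receives the same cluster feature as "Berlin", so a sequence classifier that has only seen "Berlin" labelled as a location could still generalize to "München".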