Learn - Filter - Apply - Forget. Mixed Approaches to Named Entity Recognition

We have explored and implemented different approaches to named entity recognition in German, a difficult task in this language since both regular nouns and proper names are capitalized. Our goal is to recognize person names, geographical names and company names in a computer magazine corpus. Our geographical name classifier works with precompiled lists, whereas our company name classifier learns the names from the corpus. For the recognition of person names we work with a precompiled list of first names, and the program learns the last names. For this classifier we suggest setting an activation value for each learned last name and subsequently depriming (lowering) that value until the name is "forgotten". Our evaluation results show that our mixed approaches achieve recall and precision values comparable to those reported for English. We also show that a carefully tuned cascade of name classifiers can even distinguish between different interpretations of a name token within the same document.
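As an illustration only, the activation idea for last names might be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation evaluated here: the decay unit (one step per sentence), the initial activation value, the refresh on re-use, the capitalization test, and all names (LastNameMemory, tag_person_names, FIRST_NAMES) are hypothetical, and the filter step from the title is omitted.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for the precompiled list of first names.
FIRST_NAMES = {"Peter", "Anna"}


@dataclass
class LastNameMemory:
    """Learned last names with an activation value that decays over time."""
    initial_activation: int = 2  # assumed value; a tuning parameter
    activation: dict = field(default_factory=dict)

    def learn(self, name):
        # Set (or refresh) the activation of a last name.
        self.activation[name] = self.initial_activation

    def is_active(self, name):
        return self.activation.get(name, 0) > 0

    def decay(self):
        # Lower every activation by one and forget names that reach zero.
        self.activation = {
            name: value - 1
            for name, value in self.activation.items()
            if value - 1 > 0
        }


def tag_person_names(sentences, first_names, memory):
    """Return, per sentence, the person-name strings that were recognized."""
    results = []
    for tokens in sentences:
        hits = []
        i = 0
        while i < len(tokens):
            tok = tokens[i]
            nxt = tokens[i + 1] if i + 1 < len(tokens) else ""
            # Learn: a known first name followed by a capitalized token.
            if tok in first_names and nxt[:1].isupper():
                memory.learn(nxt)
                hits.append(f"{tok} {nxt}")
                i += 2
                continue
            # Apply: a stand-alone last name that is still active.
            if memory.is_active(tok):
                memory.learn(tok)  # assumed: re-use refreshes the activation
                hits.append(tok)
            i += 1
        results.append(hits)
        # Forget: one decay step per sentence (assumed unit).
        memory.decay()
    return results
```

In this sketch a last name is only recognized on its own while its activation is positive, so the same token can be treated as a person name shortly after a full name was seen and as an ordinary word later in the text.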