Extended Feature Set for Chemical Named Entity Recognition and Indexing

The BioCreative IV CHEMDNER Task gives participants the opportunity to compare their methods for chemical named entity recognition (NER) and indexing in a controlled environment. We participated in this task with our earlier system based on conditional random fields [1], extended with a number of novel general and domain-specific features. For the latter, we used features derived from two existing chemical NER systems, ChemSpot [2] and OSCAR [3], as well as from various external resources. In this paper, we describe our approach and present a detailed ablation study that underlines the positive effect of domain-specific features on chemical NER.