Named Entity Recognition with Combinations of Conditional Random Fields

The Gene Mention task is a Named Entity Recognition (NER) task for labeling gene and gene product names in biomedical text. To handle acceptable alternative annotations in addition to the gold standard, we use combinations of Conditional Random Fields (CRFs) together with a normalizing tagger. This is followed by a postprocessing step that includes acronym disambiguation based on Latent Semantic Analysis (LSA). For robust model selection we apply 50-fold bootstrapping and obtain an average F-score of 84.58% on the training set and 86.33% on the test set.
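
As an illustration of how an average F-score over bootstrap rounds can be computed, the following is a minimal sketch and not the procedure used in this work. It assumes that per-sentence counts of true positives, false positives and false negatives are already available, resamples sentences with replacement, and averages the resulting F-scores. The names `f_score` and `bootstrap_f_score`, the seed, and the choice to resample evaluation sentences only (rather than retraining a model in each round) are assumptions made for the example.

```python
import random
from typing import List, Tuple

# Per-sentence counts: (true positives, false positives, false negatives).
Counts = Tuple[int, int, int]


def f_score(tp: int, fp: int, fn: int) -> float:
    """Balanced F-score; defined here as 0.0 when there are no true positives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def bootstrap_f_score(counts: List[Counts], n_rounds: int = 50, seed: int = 0) -> float:
    """Average F-score over n_rounds bootstrap resamples of the sentences.

    Hypothetical helper: each round draws len(counts) sentences with
    replacement, pools their counts, and scores the pooled sample.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(n_rounds):
        sample = [rng.choice(counts) for _ in counts]
        tp = sum(c[0] for c in sample)
        fp = sum(c[1] for c in sample)
        fn = sum(c[2] for c in sample)
        scores.append(f_score(tp, fp, fn))
    return sum(scores) / len(scores)
```

Pooling the counts within each round gives a micro-averaged F-score per resample; averaging per-sentence F-scores instead would weight short and long sentences equally, which is usually not what an entity-level evaluation intends.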