The UET-CAM System in the BioCreAtIvE V CDR Task

In this paper, we describe a system developed for the BioCreative V chemical-disease relation (CDR) task. The Disease Named Entity Recognition and Normalization (DNER) model employs joint learning using a perceptron-based named entity recognizer (NER) and a back-off model for named entity normalization (NEN). To maximize both precision and recall, our NEN adopts a sequential back-off ensemble approach based on Semantic Supervised Indexing (SSI), a supervised word embedding (WE) method that gives results by inference from training data, and Skip-gram, an unsupervised WE method that takes advantage of large unlabeled data. In the chemical-induced disease (CID) relation extraction model, we first resolve co-references using a multipass sieve to identify cross-sentence references for entities, thus enabling intra-sentence relations to be discovered more easily. We then extract CID relations using a support vector machine model trained on labeled sentence data from the CDR training and development datasets. We evaluated our method on both the DNER test set and the CID test set. Results show an F1 of 76.44 for the DNER task, and a best performance of 51.6 on the CID task using the multipass sieve.
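The sequential back-off idea used for normalization can be illustrated with a minimal sketch. This is not the system's implementation: the two scorers below are toy stand-ins (an exact lexicon lookup and a token-overlap matcher) for the supervised and unsupervised embedding models, and the lexicon, concept identifiers, thresholds, and function names are all hypothetical. It only shows the control flow, where a mention is passed to the next stage when the current stage is not confident enough.

```python
from typing import Callable, List, Optional, Tuple

# A scorer maps a mention to (concept_id, confidence). Hypothetical interface.
Scorer = Callable[[str], Tuple[Optional[str], float]]

# Hypothetical toy lexicon; the IDs are placeholders, not real vocabulary entries.
LEXICON = {"liver injury": "CONCEPT_A", "kidney failure": "CONCEPT_B"}


def exact_scorer(mention: str) -> Tuple[Optional[str], float]:
    """Stand-in for the first, higher-precision stage: exact lexicon lookup."""
    concept = LEXICON.get(mention.lower())
    return concept, (1.0 if concept else 0.0)


def overlap_scorer(mention: str) -> Tuple[Optional[str], float]:
    """Stand-in for the second, higher-recall stage: Jaccard token overlap."""
    tokens = set(mention.lower().split())
    best, best_score = None, 0.0
    for name, concept in LEXICON.items():
        name_tokens = set(name.split())
        score = len(tokens & name_tokens) / len(tokens | name_tokens)
        if score > best_score:
            best, best_score = concept, score
    return best, best_score


def backoff_normalize(
    mention: str, stages: List[Tuple[Scorer, float]]
) -> Optional[str]:
    """Try each stage in order; return the first sufficiently confident answer."""
    for scorer, threshold in stages:
        concept, confidence = scorer(mention)
        if concept is not None and confidence >= threshold:
            return concept
    return None  # no stage was confident enough


# Assumed thresholds, chosen only for this illustration.
STAGES = [(exact_scorer, 1.0), (overlap_scorer, 0.3)]
```

With these toy stages, `backoff_normalize("Liver injury", STAGES)` is resolved by the first stage, while `backoff_normalize("acute kidney failure", STAGES)` falls through to the second. Ordering the stages from most to least precise is what lets the ensemble gain recall without giving up the precise answers.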