Resolution of Chemical Disease Relations with Diverse Features and Rules

This paper describes the system developed by the Mayo Clinic team for the extraction of Chemical-Disease relations. We employed two approaches: a rule-based approach that extracts relations within a single sentence, and a machine learning approach that uses a diverse set of features to extract relations both within a single sentence and across multiple sentences. We trained the machine learning approach and designed the rules using 750 PubMed abstracts (500 from the training dataset and 250 from the development dataset), and used the remaining 250 PubMed abstracts from the development dataset for blind evaluation. The rule-based approach achieved an F-score of 41% on the testing data when used with gold-standard named entity annotations and 31% when used with the organizer-provided named entity annotation tools, while the machine learning approach attained an F-score of 65% with gold-standard named entity annotations and 42% with the organizer-provided named entity annotation tools.