BOUN-ISIK Participation: An Unsupervised Approach for the Named Entity Normalization and Relation Extraction of Bacteria Biotopes

This paper presents our participation in the Bacteria Biotope Task of the BioNLP Shared Task 2019. We submitted two systems, one for each of the two subtasks of the Bacteria Biotope Task: the normalization of entities (BB-norm) and the identification of relations between entities in a given biomedical text (BB-rel). For entity normalization, we used word embeddings and syntactic re-ranking. For relation extraction, we used pre-defined rules. Although both approaches are unsupervised, in the sense that they do not need any labeled data, they achieved promising results. In particular, for the BB-norm task, the results show that the proposed method performs as well as deep learning based methods, which require labeled data.
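To make the embedding-based normalization step concrete, the following is a minimal sketch of ranking ontology concepts against an entity mention by cosine similarity of averaged word vectors. It is not the system described here: the toy vectors, the concept identifiers (`C1`, `C2`, `C3`), and the function names are all assumptions made for illustration, and the syntactic re-ranking step is not shown.

```python
import math

# Toy 3-dimensional word vectors, invented for illustration only.
# A real system would load pre-trained biomedical embeddings instead.
TOY_VECTORS = {
    "human": [1.0, 0.0, 0.0],
    "gut": [0.0, 1.0, 0.0],
    "intestine": [0.0, 0.9, 0.1],
    "soil": [0.0, 0.0, 1.0],
    "milk": [0.5, 0.0, 0.5],
}

# Hypothetical concept identifiers and names (not real ontology entries).
CONCEPTS = {"C1": "human intestine", "C2": "soil", "C3": "milk"}


def embed(phrase):
    """Average the vectors of the known tokens in a phrase; None if no token is known."""
    vecs = [TOY_VECTORS[t] for t in phrase.lower().split() if t in TOY_VECTORS]
    if not vecs:
        return None
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_concepts(mention):
    """Return (concept_id, similarity) pairs sorted from most to least similar."""
    mention_vec = embed(mention)
    if mention_vec is None:
        return []
    scored = []
    for concept_id, name in CONCEPTS.items():
        concept_vec = embed(name)
        if concept_vec is not None:
            scored.append((concept_id, cosine(mention_vec, concept_vec)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # The mention shares no exact phrase with any concept name, yet the
    # embedding similarity still ranks the hypothetical concept C1 first.
    for concept_id, score in rank_concepts("human gut"):
        print(concept_id, round(score, 3))
```

No labeled training data is involved in this sketch, which is the sense in which such an approach is unsupervised: only pre-trained vectors and the concept names are needed.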
