TWO-LEVEL SELF-SUPERVISED RELATION EXTRACTION FROM MEDLINE USING UMLS

The biomedical research literature is one of many domains that hold valuable latent knowledge, and the biomedical community makes extensive use of it to discover facts about biomedical entities such as diseases and drugs. MEDLINE is a huge database of biomedical research papers that remains a significantly underutilized source of biological information. Extracting useful knowledge from such a large corpus raises several problems concerning the type of information sought, such as the concepts relevant to the domain of the texts and the semantic relationships among them. In this paper, we propose a two-level model for self-supervised relation extraction (RE) from MEDLINE using the Unified Medical Language System (UMLS) knowledge base. The model takes a self-supervised approach to RE, constructing enhanced training examples from information in UMLS. The model achieves better results than current state-of-the-art and naive approaches.
