Biomedical relation extraction via knowledge-enhanced reading comprehension

Background: Extracting chemical-disease relations from unstructured biomedical literature is an essential task in biomedical research. Its two main research problems are effective context understanding and knowledge integration. Most relation extraction work focuses on classifying entity mention pairs. Given how effective machine reading comprehension (RC) is at context understanding, solving biomedical relation extraction with an RC framework at both the intra-sentential and inter-sentential levels is a new topic worth exploring. In addition to unstructured biomedical text, many structured knowledge bases (KBs) provide valuable guidance for biomedical relation extraction, and how to use this knowledge within the RC framework is also worth investigating. We propose a knowledge-enhanced reading comprehension (KRC) framework that leverages reading comprehension and prior knowledge for biomedical relation extraction. First, we generate questions for each relation, which reformulates the relation extraction task as a question answering task. Second, building on the RC framework, we integrate knowledge representations through an efficient knowledge-enhanced attention interaction mechanism to guide biomedical relation extraction.

Results: The proposed model was evaluated on the BioCreative V CDR dataset and the CHR dataset. Experiments show that it achieved competitive document-level F1 scores of 71.18% and 93.3%, respectively, compared with other methods.

Conclusion: Result analysis reveals that open-domain reading comprehension data and knowledge representation both help improve biomedical relation extraction in the proposed KRC framework. Our work can encourage more research on bridging reading comprehension and biomedical relation extraction, and can help advance biomedical relation extraction.
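
The reformulation of relation extraction as question answering mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the question wording or the data format, so the template `QUESTION_TEMPLATE`, the `QAExample` class, the `build_qa_examples` helper, and the toy input below are all assumptions made for illustration, not the authors' implementation. The idea shown is only that each chemical yields one question over the document, and the related disease mentions become its answers.

```python
from dataclasses import dataclass, field

# Hypothetical question template; the source does not give its exact wording.
QUESTION_TEMPLATE = "What disease does {chemical} induce?"


@dataclass
class QAExample:
    question: str
    context: str
    # Disease mentions that answer the question; empty means "no relation".
    answers: list = field(default_factory=list)


def build_qa_examples(context, chemicals, diseases, gold_pairs):
    """Turn one document into one QA example per chemical (illustrative only)."""
    examples = []
    for chemical in chemicals:
        question = QUESTION_TEMPLATE.format(chemical=chemical)
        # Keep only the diseases annotated as related to this chemical.
        answers = [d for d in diseases if (chemical, d) in gold_pairs]
        examples.append(QAExample(question, context, answers))
    return examples


# Toy, made-up input purely for illustration (not taken from any dataset).
context = "Patients given drug A developed nausea; drug B caused no headache."
examples = build_qa_examples(
    context,
    chemicals=["drug A", "drug B"],
    diseases=["nausea", "headache"],
    gold_pairs={("drug A", "nausea")},
)
for ex in examples:
    print(ex.question, "->", ex.answers)
```

In this sketch a chemical with no related disease produces a question with an empty answer list, which is one plausible way to represent negative instances in an RC setting.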
