Biomedical Knowledge Graphs Construction From Conditional Statements

Conditions play an essential role in biomedical statements. However, existing biomedical knowledge graphs (BioKGs) focus only on factual knowledge, organized as a flat relational network of biomedical concepts. These BioKGs ignore the conditions under which the facts are valid, and so lose essential context for knowledge exploration and inference. We consider both facts and their conditions in biomedical statements and propose a three-layered, information-lossless representation of BioKGs. The first layer contains biomedical concept nodes and attribute nodes. The second layer represents both fact tuples and condition tuples as nodes of relation phrases, each connected to its subject and object in the first layer. The third layer contains statement nodes, each connected to a set of fact tuples and/or condition tuples in the second layer. We transform the BioKG construction problem into a sequence labeling problem based on a newly designed tag schema. We design a Multi-Input Multi-Output sequence labeling model (MIMO) that learns from multiple input signals and generates the proper number of output sequences for tuple extraction. Experiments on a newly constructed dataset show that MIMO outperforms existing methods. A further case study demonstrates that the constructed BioKGs provide a good understanding of the biomedical statements.
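The three-layered representation described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class names (`RelationTuple`, `ThreeLayerGraph`), the method names, and the hand-written decomposition of the example sentence into one fact tuple and two condition tuples are all assumptions made for illustration, and no extraction model is involved.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RelationTuple:
    """Layer-2 node: a relation phrase linking a subject and an object (hypothetical type)."""
    subject: str
    relation: str
    obj: str
    kind: str  # "fact" or "condition"


@dataclass
class ThreeLayerGraph:
    """Illustrative container for the three layers (hypothetical, not the paper's code)."""
    concepts: set = field(default_factory=set)       # layer 1: concept/attribute nodes
    tuples: list = field(default_factory=list)       # layer 2: fact and condition tuples
    statements: dict = field(default_factory=dict)   # layer 3: statement id -> tuple indices

    def add_statement(self, statement_id, relation_tuples):
        """Register a statement and connect it to its tuples and their concepts."""
        indices = []
        for t in relation_tuples:
            self.concepts.update((t.subject, t.obj))  # link into layer 1
            self.tuples.append(t)                     # add the layer-2 node
            indices.append(len(self.tuples) - 1)
        self.statements[statement_id] = indices       # add the layer-3 node

    def tuples_of(self, statement_id, kind):
        """Return the tuples of one kind ("fact" or "condition") under a statement."""
        return [self.tuples[i] for i in self.statements[statement_id]
                if self.tuples[i].kind == kind]


# Hand-labelled example (an assumption, not model output) for the sentence:
# "TRPV5/V6 channels mediate Ca2+ influx in Jurkat T cells
#  under the control of extracellular pH."
graph = ThreeLayerGraph()
graph.add_statement("s1", [
    RelationTuple("TRPV5/V6 channels", "mediate", "Ca2+ influx", "fact"),
    RelationTuple("Ca2+ influx", "in", "Jurkat T cells", "condition"),
    RelationTuple("Ca2+ influx", "under the control of", "extracellular pH", "condition"),
])

facts = graph.tuples_of("s1", "fact")
conditions = graph.tuples_of("s1", "condition")
```

Keeping the statement node as the parent of both fact and condition tuples is what preserves the context: a flat graph would keep only the `mediate` edge and drop the cell type and pH conditions attached to it.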
