PrTransH: Embedding Probabilistic Medical Knowledge from Real World EMR Data

This paper proposes PrTransH, an algorithm for learning embedding vectors from medical knowledge derived from real-world electronic medical record (EMR) data. The distinctive challenge in embedding a medical knowledge graph built from real-world EMR data is that the uncertainty of knowledge triplets blurs the boundary between a "correct triplet" and a "wrong triplet", undermining a fundamental assumption of many existing algorithms. To address this challenge, the existing TransH algorithm is enhanced in three ways: 1) the probability of each medical knowledge triplet is incorporated into the training objective; 2) the margin-based ranking loss is replaced with a unified loss computed over both valid and corrupted triplets; 3) the training data set is augmented with medical background knowledge. Experiments on a medical knowledge graph built from real-world EMR data show that PrTransH outperforms TransH on the link prediction task. To the best of our knowledge, this paper is the first to learn and verify knowledge embeddings on probabilistic knowledge graphs.
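To make the first two enhancements concrete, the sketch below pairs the standard TransH scoring function (project head and tail onto a relation-specific hyperplane, then translate) with one possible probability-aware loss. The abstract does not state the exact PrTransH objective, so `probability_aware_loss`, its `scale` parameter, the exponential mapping from score to plausibility, and the convention that corrupted triplets carry a target probability of 0 are all illustrative assumptions, not the authors' formulation.

```python
import numpy as np

def transh_score(h, t, w_r, d_r):
    """TransH dissimilarity score (lower means more plausible).

    h, t : entity embeddings for head and tail
    w_r  : normal vector of the relation-specific hyperplane
    d_r  : translation vector of the relation within that hyperplane
    """
    w = w_r / np.linalg.norm(w_r)      # TransH constrains the normal to unit length
    h_perp = h - np.dot(w, h) * w      # projection of the head onto the hyperplane
    t_perp = t - np.dot(w, t) * w      # projection of the tail onto the hyperplane
    return float(np.sum((h_perp + d_r - t_perp) ** 2))

def probability_aware_loss(score, prob, scale=1.0):
    """Hypothetical unified loss (an assumption, not the paper's formula).

    The score is mapped to a plausibility in (0, 1] and compared with the
    triplet's probability. Valid triplets use their observed probability;
    corrupted triplets are assumed here to have probability 0.
    """
    plausibility = np.exp(-scale * score)
    return float((plausibility - prob) ** 2)

# Toy 3-dimensional example with made-up embeddings.
h = np.array([1.0, 0.0, 0.0])
t = np.array([1.0, 1.0, 0.0])
w_r = np.array([0.0, 0.0, 1.0])
d_r = np.array([0.0, 1.0, 0.0])

valid_loss = probability_aware_loss(transh_score(h, t, w_r, d_r), prob=0.9)
corrupt_t = np.array([5.0, 5.0, 0.0])  # randomly replaced tail entity
corrupt_loss = probability_aware_loss(transh_score(h, corrupt_t, w_r, d_r), prob=0.0)
```

Unlike a margin-based ranking loss, which only asks a valid triplet to score better than its corrupted counterpart by a fixed margin, a loss of this shape treats both kinds of triplet the same way and lets a triplet with probability 0.3 pull the embeddings less strongly than one with probability 0.9.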
