Enriching Medical Terminology Knowledge Bases via Pre-trained Language Model and Graph Convolutional Network

Enriching existing medical terminology knowledge bases (KBs) is an important and ongoing task for clinical research, because new terminology aliases are continually added and standard terminologies may be renamed. In this paper, we propose a novel automatic terminology enriching approach that adds a set of new terminologies to a KB. Specifically, the characters of each terminology and entity are first fed into a pre-trained language model to obtain semantic embeddings. The pre-trained model is used again to initialize the terminology and entity representations, which are then passed through a graph convolutional network to obtain structure embeddings. Afterwards, the semantic and structure embeddings are combined to measure the relevancy between a terminology and an entity. Finally, the optimal alignment is selected by ranking all entities in the KB by their relevancy to the terminology. Experimental results on a clinical indicator terminology KB, collected from 38 top-class hospitals of the Shanghai Hospital Development Center, show that our proposed approach outperforms baseline methods and can effectively enrich the KB.
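The final two steps (combining the two embeddings into a relevancy score, then ranking the KB entities) can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say how the embeddings are combined, so the weighted sum of cosine similarities, the weight `alpha`, the function names, and the toy vectors below are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_entities(term, entities, alpha=0.5):
    """Rank KB entities by relevancy to one terminology.

    `term` and each value in `entities` hold a "semantic" vector and a
    "structure" vector. The relevancy is an assumed weighted sum of the
    two cosine similarities; `alpha` is a hypothetical mixing weight.
    """
    scores = {}
    for name, ent in entities.items():
        sem = cosine(term["semantic"], ent["semantic"])
        struct = cosine(term["structure"], ent["structure"])
        scores[name] = alpha * sem + (1 - alpha) * struct
    # Highest relevancy first; the top entry is the proposed alignment.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy vectors standing in for language-model and GCN outputs (made up).
term = {"semantic": [0.9, 0.1, 0.0], "structure": [0.2, 0.8]}
entities = {
    "entity_a": {"semantic": [1.0, 0.0, 0.0], "structure": [0.1, 0.9]},
    "entity_b": {"semantic": [0.0, 1.0, 0.0], "structure": [0.9, 0.1]},
}

ranking = rank_entities(term, entities)
best_entity = ranking[0][0]
print(best_entity)
```

In this toy setup `entity_a` is close to the terminology in both views, so it is ranked first. In practice the vectors would come from the pre-trained language model and the graph convolutional network, respectively.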
