A System for Recognizing Entities and Extracting Relations from Electronic Medical Records

Mining rich knowledge from clinical texts has become a popular research topic. Knowledge graphs are widely used to integrate and manage such knowledge, and entity recognition and relation extraction play important roles in constructing them. In this paper, we develop a system that recognizes entities and extracts their relations from clinical texts in Electronic Medical Records. The system implements four major functions: manual entity annotation, automatic entity recognition, manual relation annotation, and automatic relation extraction. The entity and relation annotation tools help professionals manually annotate the original clinical texts. To make this work more efficient, automatic entity recognition and relation extraction, which apply CRF and CNN models, can be run before manual annotation. Our system has been used in several applications, such as medical knowledge graph construction and a health question-answering (QA) system.
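As an illustration of the pre-annotation workflow described above, the following is a minimal sketch and not the system's actual implementation. It assumes character-offset annotations and uses a small hypothetical lexicon as a stand-in for the trained CRF recognizer; the names `Entity`, `Relation`, `LEXICON`, `pre_annotate`, and `apply_review`, as well as all labels, are assumptions introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    start: int  # character offset, inclusive
    end: int    # character offset, exclusive
    label: str


@dataclass(frozen=True)
class Relation:
    head: Entity
    tail: Entity
    label: str


# Hypothetical lexicon; a stand-in for a trained CRF tagger, not the real model.
LEXICON = {"chest pain": "Symptom", "aspirin": "Drug"}


def pre_annotate(text):
    """Suggest entities by exact lexicon match (stand-in for automatic recognition)."""
    found = []
    lowered = text.lower()
    for term, label in LEXICON.items():
        pos = lowered.find(term)
        while pos != -1:
            found.append(Entity(pos, pos + len(term), label))
            pos = lowered.find(term, pos + len(term))
    return sorted(found, key=lambda e: e.start)


def apply_review(suggested, rejected=(), added=()):
    """Combine automatic suggestions with an annotator's manual corrections."""
    kept = [e for e in suggested if e not in rejected]
    return sorted(set(kept) | set(added), key=lambda e: e.start)


text = "Patient reports chest pain; aspirin was given."
suggested = pre_annotate(text)                            # automatic step
final = apply_review(suggested, added=[Entity(0, 7, "Person")])  # manual step
relation = Relation(final[1], final[2], "treated_by")     # manually annotated relation
```

In this sketch the annotator only rejects or adds spans on top of the suggestions, which is the sense in which automatic recognition before manual annotation can reduce effort.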
