Paper information - Extracting entities with attributes in clinical text via joint deep learning

OBJECTIVE Extracting clinical entities and their attributes is a fundamental task of natural language processing (NLP) in the medical domain. This task is typically treated as 2 sequential subtasks in a pipeline: clinical entity or attribute recognition, followed by entity-attribute relation extraction. One problem with pipeline methods is that errors from entity recognition are unavoidably passed on to relation extraction. We propose a novel joint deep learning method that recognizes clinical entities or attributes and extracts entity-attribute relations simultaneously.

MATERIALS AND METHODS The proposed method integrates 2 state-of-the-art methods for named entity recognition and relation extraction, namely bidirectional long short-term memory with conditional random field (BiLSTM-CRF) and bidirectional long short-term memory (BiLSTM), respectively, into a unified framework. The method also takes into account relation constraints between clinical entities and attributes, as well as the weights of the 2 subtasks. We compare the method with other related methods (i.e., pipeline methods and other joint deep learning methods) on an existing English corpus from SemEval-2015 and a newly developed Chinese corpus.

RESULTS Our proposed method achieves the best F1 of 74.46% on entity recognition and the best F1 of 50.21% on relation extraction on the English corpus, and 89.32% and 88.13%, respectively, on the Chinese corpus, outperforming the other methods on both tasks.

CONCLUSIONS The joint deep learning-based method could improve both entity recognition and relation extraction from clinical text in both English and Chinese, indicating that the approach is promising.
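The abstract mentions two ingredients without giving their exact form: relation constraints between entity and attribute types, and weights on the 2 subtasks. The Python sketch below illustrates one plausible reading of each and is not the authors' implementation. The constraint table `ALLOWED_PAIRS`, the type names, the helper `candidate_pairs`, and the convex-combination weight `alpha` in `joint_loss` are all assumptions made for illustration.

```python
from typing import NamedTuple


class Span(NamedTuple):
    text: str
    label: str  # entity or attribute type assigned by the tagger


# Hypothetical constraint table: (entity type, attribute type) pairs that
# are allowed to be related. These type names are illustrative only.
ALLOWED_PAIRS = {("Disorder", "Severity"), ("Disorder", "BodyLocation")}


def candidate_pairs(spans):
    """Keep only ordered (entity, attribute) pairs whose types may be related."""
    return [
        (a, b)
        for a in spans
        for b in spans
        if (a.label, b.label) in ALLOWED_PAIRS
    ]


def joint_loss(ner_loss, rel_loss, alpha=0.5):
    """Weighted sum of the two subtask losses (assumed form, alpha in [0, 1])."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return alpha * ner_loss + (1.0 - alpha) * rel_loss


spans = [
    Span("severe", "Severity"),
    Span("chest pain", "Disorder"),
    Span("aspirin", "Drug"),
]
pairs = candidate_pairs(spans)
```

Under these assumptions, filtering by type pair means the relation classifier only scores pairs that could form a valid entity-attribute relation, and a single scalar weight trades off the two objectives during joint training.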
