Applying a deep learning-based sequence labeling approach to detect attributes of medical concepts in clinical text

Traditional methods for detecting attributes of medical concepts in clinical text often consist of two steps: named entity recognition of attributes, followed by relation classification between medical concepts and attributes. Here we present a novel solution in which attribute detection for given concepts is converted into a sequence labeling problem, so that attribute entity recognition and relation classification are performed simultaneously in one step. A neural architecture combining bidirectional Long Short-Term Memory networks and Conditional Random Fields (Bi-LSTM-CRF) was adopted to detect various medical concept-attribute pairs efficiently. We then compared our deep learning-based sequence labeling approach with traditional two-step systems on three attribute detection tasks: disease-modifier, medication-signature, and lab test-value. Our results show that the proposed method achieved higher accuracy than the traditional methods on all three medical concept-attribute detection tasks. This study demonstrates the efficacy of our sequence labeling approach using Bi-LSTM-CRF on the attribute detection task, indicating its potential to speed up practical clinical NLP applications.
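The conversion to sequence labeling can be illustrated with a minimal sketch. The abstract does not give the exact encoding, so everything here is an assumption: the function name `encode_for_concept`, the BIO tag scheme, the binary concept-position flag, and the attribute type names are hypothetical. The idea shown is that one labeled sequence is built per target concept, and only attributes belonging to that concept receive non-`O` tags, so a single tagger handles both recognition and linking.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # token span as [start, end)


def encode_for_concept(
    tokens: List[str],
    concept: Span,
    attributes: List[Tuple[Span, str]],
) -> Tuple[List[int], List[str]]:
    """Build one training sequence for a single target concept.

    Hypothetical encoding (not taken from the paper):
    - concept_flags marks which tokens belong to the target concept,
      so the tagger knows which concept the labels refer to.
    - labels are BIO tags covering only the attributes linked to
      that concept; attributes of other concepts stay "O".
    """
    concept_flags = [
        1 if concept[0] <= i < concept[1] else 0 for i in range(len(tokens))
    ]
    labels = ["O"] * len(tokens)
    for (start, end), attr_type in attributes:
        labels[start] = f"B-{attr_type}"
        for i in range(start + 1, end):
            labels[i] = f"I-{attr_type}"
    return concept_flags, labels


# Illustrative sentence with two medication concepts.
tokens = ["aspirin", "81", "mg", "daily", "and", "lisinopril", "10", "mg"]

# Sequence for the concept "aspirin": only its own attributes are tagged.
aspirin_flags, aspirin_labels = encode_for_concept(
    tokens, (0, 1), [((1, 3), "DOSAGE"), ((3, 4), "FREQUENCY")]
)

# Sequence for the concept "lisinopril": same tokens, different labels.
lisinopril_flags, lisinopril_labels = encode_for_concept(
    tokens, (5, 6), [((6, 8), "DOSAGE")]
)
```

Under this assumed encoding, the same sentence yields a different label sequence for each concept, which is what lets one tagging pass replace the separate relation classification step.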
