Extraction of Information Related to Adverse Drug Events from Electronic Health Record Notes: Design of an End-to-End Model Based on Deep Learning

Background: Pharmacovigilance and drug-safety surveillance are crucial for monitoring adverse drug events (ADEs), but the main ADE-reporting systems, such as the Food and Drug Administration Adverse Event Reporting System, face challenges such as underreporting. Therefore, as complementary surveillance, data on ADEs are extracted from electronic health record (EHR) notes via natural language processing (NLP). As NLP has developed, many recent machine-learning techniques, such as deep learning and multi-task learning (MTL), have been introduced in this field. However, only a few studies have focused on employing such techniques to extract ADEs.

Objective: We aimed to design a deep learning model for extracting ADEs and related information such as medications and indications. Since extraction of ADE-related information includes two steps (named entity recognition and relation extraction), our second objective was to improve the deep learning model using MTL between the two steps.

Methods: We employed the dataset from the Medication, Indication and Adverse Drug Events (MADE) 1.0 challenge to train and test our models. This dataset consists of 1089 EHR notes of cancer patients and includes 9 entity types, such as Medication, Indication, and ADE, and 7 types of relations between these entities. To extract information from the dataset, we proposed a deep learning model that uses a bidirectional long short-term memory (BiLSTM) conditional random field network to recognize entities and a BiLSTM-Attention network to extract relations. To further improve the deep learning model, we employed three typical MTL methods, namely, hard parameter sharing, parameter regularization, and task relation learning, to build three MTL models, called HardMTL, RegMTL, and LearnMTL, respectively.

Results: Since extraction of ADE-related information is a two-step task, the result of the second step (ie, relation extraction) was used to compare all models. We used microaveraged precision, recall, and F1 as evaluation metrics. Our deep learning model achieved state-of-the-art results (F1=65.9%), significantly higher than the F1 of the best system in the MADE 1.0 challenge (61.7%). HardMTL further improved the F1 by 0.8 percentage points, to 66.7%, whereas RegMTL and LearnMTL failed to improve performance.

Conclusions: Deep learning models can significantly improve the performance of ADE-related information extraction. MTL may be effective for named entity recognition and relation extraction, but its effectiveness depends on the methods, data, and other factors. Our results can facilitate research on ADE detection, NLP, and machine learning.
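
As an illustration of the microaveraged precision, recall, and F1 used to compare models on relation extraction, the following is a minimal sketch and not the challenge's official scorer. It assumes each note's relations are represented as a set of (head entity id, tail entity id, relation type) tuples and that a predicted relation counts as correct only on an exact match; the function name `micro_prf`, the entity ids, and the relation labels are hypothetical.

```python
def micro_prf(gold, pred):
    """Microaveraged precision, recall, and F1 over a collection of notes.

    gold, pred: parallel lists with one set of relation tuples per note.
    Counts are pooled across all notes before the ratios are computed,
    which is what distinguishes micro- from macroaveraging.
    """
    tp = fp = fn = 0
    for gold_rels, pred_rels in zip(gold, pred):
        tp += len(gold_rels & pred_rels)   # predicted and annotated
        fp += len(pred_rels - gold_rels)   # predicted but not annotated
        fn += len(gold_rels - pred_rels)   # annotated but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical toy data: two notes, illustrative labels only.
gold = [
    {("med1", "ade1", "adverse"), ("med1", "ind1", "reason")},
    {("med2", "ade2", "adverse")},
]
pred = [
    {("med1", "ade1", "adverse")},
    {("med2", "ade2", "adverse"), ("med2", "ind9", "reason")},
]
precision, recall, f1 = micro_prf(gold, pred)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

Because the second step consumes the entities predicted in the first, a relation between entities that were never recognized can only appear as a false negative here, which is why the relation-level score is a reasonable single number for comparing end-to-end systems.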
