Dictionary construction and identification of possible adverse drug events in Danish clinical narrative text

Objective: Drugs have tremendous potential to cure and relieve disease, but the risk of unintended effects is always present. Healthcare providers increasingly record data in electronic patient records (EPRs), in which we aim to identify possible adverse events (AEs) and, specifically, possible adverse drug events (ADEs).

Materials and methods: Based on the undesirable effects section of the summary of product characteristics (SPC) of 7446 drugs, we built a Danish ADE dictionary. Starting from this dictionary, we developed a pipeline for identifying possible ADEs in unstructured clinical narrative text. We use a named entity recognition (NER) tagger to identify dictionary matches in the text and post-coordination rules to construct ADE compound terms. Finally, we apply post-processing rules and filters to handle, for example, negations and sentences about subjects other than the patient. The method also identifies synonyms and merges anatomical location descriptions, so that effects in the same location are grouped together.

Results: The method identified 1 970 731 (35 477 unique) possible ADEs in a large corpus of 6011 psychiatric hospital patient records. Validation by manual inspection of possible ADEs gave a precision of 89% and a recall of 75%.

Discussion: The presented dictionary-building method could be used to construct other ADE dictionaries. The complication of compound words in Germanic languages was addressed. Additionally, collapsing synonyms and anatomical locations improves the method.

Conclusions: The developed dictionary and method can be used to identify possible ADEs in Danish clinical narratives.
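
The pipeline stages described under Materials and methods (dictionary matching, post-coordination with anatomical locations, synonym collapse, and filtering of negations and non-patient subjects) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy dictionary, the location list, the cue words, the three-token windows, and the function name `find_possible_ades` are all assumptions made for illustration, and plain token lookup stands in for the NER tagger.

```python
import re

# Hypothetical toy dictionary: surface form -> canonical ADE term.
# Mapping several surface forms to one term models synonym collapse.
ADE_DICTIONARY = {
    "kvalme": "kvalme",          # nausea
    "hovedpine": "hovedpine",    # headache
    "cephalalgia": "hovedpine",  # synonym of headache
    "smerter": "smerter",        # pain
}

# Hypothetical anatomical locations: inflected form -> canonical location.
LOCATIONS = {"mave": "mave", "maven": "mave", "ryg": "ryg", "ryggen": "ryg"}

# Hypothetical cue words (assumptions, not taken from the paper).
NEGATION_CUES = {"ingen", "ikke", "uden"}               # no, not, without
OTHER_SUBJECT_CUES = {"mor", "far", "søster", "bror"}   # family members


def tokenize(sentence):
    """Lowercase the sentence and split it into word tokens."""
    return re.findall(r"\w+", sentence.lower())


def find_possible_ades(text):
    """Return possible ADE terms found in text, in order of appearance."""
    results = []
    for sentence in re.split(r"[.!?]+", text):
        tokens = tokenize(sentence)
        if not tokens:
            continue
        # Filter: skip sentences that mention a subject other than the patient.
        if OTHER_SUBJECT_CUES & set(tokens):
            continue
        for i, token in enumerate(tokens):
            if token not in ADE_DICTIONARY:
                continue
            # Filter: skip a match with a negation cue in the 3 preceding tokens.
            if NEGATION_CUES & set(tokens[max(0, i - 3):i]):
                continue
            term = ADE_DICTIONARY[token]
            # Post-coordination: attach a location from the 3 following tokens.
            location = next(
                (LOCATIONS[t] for t in tokens[i + 1:i + 4] if t in LOCATIONS),
                None,
            )
            results.append(f"{term} [{location}]" if location else term)
    return results


if __name__ == "__main__":
    note = (
        "Patienten har kvalme og smerter i maven. Ingen hovedpine. "
        "Patientens mor har hovedpine. Klager over cephalalgia."
    )
    print(find_possible_ades(note))
```

In this toy note the negated headache and the headache attributed to the patient's mother are filtered out, while "cephalalgia" is reported under the canonical term "hovedpine". The sketch does not attempt Danish compound-word splitting, which the abstract names as a separate complication.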
