Extracting and standardizing medication information in clinical text – the MedEx-UIMA system

Extraction of medication information embedded in clinical text is important for research using electronic health records (EHRs). However, most of current medication information extraction systems identify drug and signature entities without mapping them to standard representation. In this study, we introduced the open source Java implementation of MedEx, an existing high-performance medication information extraction system, based on the Unstructured Information Management Architecture (UIMA) framework. In addition, we developed new encoding modules in the MedEx-UIMA system, which mapped an extracted drug name/dose/form to both generalized and specific RxNorm concepts and translated drug frequency information to ISO standard. We processed 826 documents by both systems and verified that MedEx-UIMA and MedEx (the Python version) performed similarly by comparing both results. Using two manually annotated test sets that contained 300 drug entries from medication list and 300 drug entries from narrative reports, the MedEx-UIMA system achieved F-measures of 98.5% and 97.5% respectively for encoding drug names to corresponding RxNorm generic drug ingredients, and F-measures of 85.4% and 88.1% respectively for mapping drug names/dose/form to the most specific RxNorm concepts. It also achieved an F-measure of 90.4% for normalizing frequency information to ISO standard. The open source MedEx-UIMA system is freely available online at http://code.google.com/p/medex-uima/.

[1]  Özlem Uzuner,et al.  Extracting medication information from clinical text , 2010, J. Am. Medical Informatics Assoc..

[2]  David L. Reich,et al.  Extraction and Mapping of Drug Names from Free Text to a Standardized Nomenclature , 2007, AMIA.

[3]  Min Li,et al.  High accuracy information extraction of medication information from clinical notes: 2009 i2b2 medication extraction challenge , 2010, J. Am. Medical Informatics Assoc..

[4]  Hong Yu,et al.  Lancet: a high precision medication event extraction system for clinical text , 2010, J. Am. Medical Informatics Assoc..

[5]  David A. Ferrucci,et al.  UIMA: an architectural approach to unstructured information processing in the corporate research environment , 2004, Natural Language Engineering.

[6]  P. Jaccard THE DISTRIBUTION OF THE FLORA IN THE ALPINE ZONE.1 , 1912 .

[7]  Geoff Gordon,et al.  Use of natural language programming to extract medication from unstructured electronic medical records. , 2007, AMIA ... Annual Symposium proceedings. AMIA Symposium.

[8]  D. Roden,et al.  The Emerging Role of Electronic Medical Records in Pharmacogenomics , 2011, Clinical pharmacology and therapeutics.

[9]  D A Evans,et al.  Automating concept identification in the electronic medical record: an experiment in extracting dosage information. , 1996, Proceedings : a conference of the American Medical Informatics Association. AMIA Fall Symposium.

[10]  James Pustejovsky,et al.  TimeML: Robust Specification of Event and Temporal Expressions in Text , 2003, New Directions in Question Answering.

[11]  Domonkos Tikk,et al.  Improving textual medication extraction using combined conditional random fields and rule-based systems , 2010, J. Am. Medical Informatics Assoc..

[12]  Vasudevan Jagannathan,et al.  Assessment of commercial NLP engines for medication information extraction from dictated clinical notes , 2009, Int. J. Medical Informatics.

[13]  Son Doan,et al.  Application of information technology: MedEx: a medication information extraction system for clinical narratives , 2010, J. Am. Medical Informatics Assoc..

[14]  Son Doan,et al.  Integrating existing natural language processing tools for medication extraction from discharge summaries , 2010, J. Am. Medical Informatics Assoc..

[15]  Marylyn D. Ritchie,et al.  The use of a DNA biobank linked to electronic medical records to characterize pharmacogenomic predictors of tacrolimus dose requirement in kidney transplant recipients , 2012, Pharmacogenetics and genomics.

[16]  Hua Xu,et al.  Facilitating pharmacogenetic studies using electronic health records and natural-language processing: a case study of warfarin , 2011, J. Am. Medical Informatics Assoc..