Extracting drug indication information from structured product labels using natural language processing

OBJECTIVE To extract drug indications from structured drug labels and represent the information using codes from standard medical terminologies.

MATERIALS AND METHODS We used MetaMap and other publicly available resources to extract information from the indications section of drug labels. Drugs and indications were encoded with RxNorm and UMLS identifiers, respectively. A sample was manually reviewed. We also compared the results with two independent information sources: the National Drug File-Reference Terminology and the Semantic Medline project.

RESULTS A total of 6797 drug labels were processed, resulting in 19 473 unique drug-indication pairs. Manual review of the 298 most frequently prescribed drugs by seven physicians showed a recall of 0.95 and a precision of 0.77. Inter-rater agreement (Fleiss κ) was 0.713. For the subset of results corroborated by Semantic Medline extractions, precision increased to 0.93.

DISCUSSION Correlating a patient's medical problems with the drugs in an electronic health record has been used to improve data quality and reduce medication errors. Authoritative drug indication information is available from drug labels, but not in a format readily usable by computer applications. Our study shows that it is feasible to use publicly available natural language processing resources to extract drug indications from drug labels. The same method can be applied to other sections of the drug label, for example, adverse effects and contraindications.

CONCLUSIONS It is feasible to use publicly available natural language processing tools to extract indication information from freely available drug labels. Named entity recognition tools (eg, MetaMap) provide reasonable recall. Combination with other data sources provides higher precision.
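The evaluation idea in the abstract (score extracted drug-indication pairs against a reviewed reference, then re-score only the pairs corroborated by a second source) can be sketched as below. This is a minimal illustration, not the authors' pipeline: the function names, the pair representation, and all identifiers in the toy data are hypothetical placeholders, and the numbers it produces have no relation to the reported results.

```python
from typing import Set, Tuple

# A drug-indication pair: (drug identifier, indication identifier).
# In the study these would be RxNorm and UMLS identifiers; the values
# used below are made-up placeholders, not real codes.
Pair = Tuple[str, str]


def precision_recall(extracted: Set[Pair], reference: Set[Pair]) -> Tuple[float, float]:
    """Return (precision, recall) of extracted pairs against a reviewed reference set."""
    true_positives = len(extracted & reference)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


def corroborated(extracted: Set[Pair], other_source: Set[Pair]) -> Set[Pair]:
    """Keep only extracted pairs that also appear in an independent source."""
    return extracted & other_source


# Toy data (hypothetical): pairs produced by the extraction step.
extracted = {("drugA", "C1"), ("drugA", "C2"), ("drugB", "C3"), ("drugB", "C4")}
# Toy data (hypothetical): pairs judged correct by manual review.
reference = {("drugA", "C1"), ("drugB", "C3"), ("drugB", "C5")}
# Toy data (hypothetical): pairs found in a second, independent source.
other_source = {("drugA", "C1"), ("drugB", "C3"), ("drugC", "C9")}

p_all, r_all = precision_recall(extracted, reference)
subset = corroborated(extracted, other_source)
p_sub, r_sub = precision_recall(subset, reference)

print(f"all pairs:          precision={p_all:.2f} recall={r_all:.2f}")
print(f"corroborated pairs: precision={p_sub:.2f} recall={r_sub:.2f}")
```

Intersecting with a second source can only remove pairs, so it trades coverage for confidence: precision on the retained subset may rise, while the number of pairs retained can only stay the same or fall.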
