Toward Creating a Gold Standard of Drug Indications from FDA Drug Labels

Quick access to trustworthy drug-disease relationships (which drugs are approved for treating or preventing which diseases) is one of the top information needs of health providers, consumers, and researchers. This paper presents a semi-automatic approach that can lead to the creation of a gold standard of drugs and their indications. As system input, we use DailyMed, which houses the most current drug labels submitted to the FDA by pharmaceutical companies. Extracting specific indications from FDA labels is a challenging problem because it requires distinguishing indications from other disease mentions. In response, we first identify candidate indications in drug labels using UMLS resources and BioNLP tools, and then rely on expert judgments to validate those pre-computed indications through an interactive Web interface. For a preliminary analysis, we recruited two experts to manually annotate 100 labels of human prescription drugs that are frequently sought at PubMed Health. We find that the resulting expert-curated gold standard of drugs and their indications is of high quality (precision=97%, recall=94%), and differs from existing resources in that it is factual, structured, and dose-form specific. The study findings suggest that the proposed method is a feasible route toward building a comprehensive resource of drug indications.
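As a rough illustration of how precision and recall figures like those above can be computed, the sketch below compares a set of curated drug-indication pairs against a reference set. It is a minimal sketch, not the evaluation procedure used in the study, which the abstract does not spell out. The function name `precision_recall`, the drug names, and the toy pairs are all hypothetical.

```python
def precision_recall(curated, reference):
    """Return (precision, recall) of curated pairs against a reference set.

    Both arguments are iterables of (drug, indication) tuples.
    Exact-match comparison is an assumption made for this sketch.
    """
    curated, reference = set(curated), set(reference)
    true_positives = len(curated & reference)
    precision = true_positives / len(curated) if curated else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical toy data, not taken from the study.
curated_pairs = {
    ("drug_a", "hypertension"),
    ("drug_a", "heart failure"),
    ("drug_b", "asthma"),
}
reference_pairs = {
    ("drug_a", "hypertension"),
    ("drug_b", "asthma"),
    ("drug_b", "bronchospasm"),
}

p, r = precision_recall(curated_pairs, reference_pairs)
print(f"precision={p:.2f}, recall={r:.2f}")
```

Treating each (drug, indication) pair as a set element keeps the comparison order-independent and ignores duplicates. A real evaluation would likely also need to normalize disease names to concept identifiers before matching.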
