Automated identification of molecular effects of drugs (AIMED)

INTRODUCTION Genomic profiling information is frequently available to oncologists, enabling targeted cancer therapy. Because clinically relevant information is rapidly emerging in the literature and elsewhere, there is a need for informatics technologies to support targeted therapies. To this end, we have developed a system for Automated Identification of Molecular Effects of Drugs (AIMED), which helps biomedical scientists curate this literature to facilitate decision support.

OBJECTIVES To create an automated system that identifies assertions in the literature concerning drugs targeting genes with therapeutic implications, and to characterize the challenges inherent in automating this process in rapidly evolving domains.

METHODS We used subject-predicate-object triples (semantic predications) and co-occurrence relations generated by applying the SemRep Natural Language Processing system to MEDLINE abstracts and ClinicalTrials.gov descriptions. We applied customized semantic queries to find drugs targeting genes of interest. The results were manually reviewed by a team of experts.

RESULTS Compared to a manually curated set of relationships, recall, precision, and F2 were 0.39, 0.21, and 0.33, respectively, which represents a 3- to 4-fold improvement over a publicly available set of predications (SemMedDB) alone. Upon review of ostensibly false positive results, 26% were considered relevant additions to the reference set, and an additional 61% were considered to be relevant for review. Adding co-occurrence data improved results for drugs in early development, but not for their better-established counterparts.

CONCLUSIONS Precision medicine poses unique challenges for biomedical informatics systems that help domain experts find answers to their research questions. Further research is required to improve the performance of such systems, particularly for drugs in development.
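The F2 value reported in the results can be checked against the reported precision and recall. The sketch below assumes the standard F-beta definition, F_beta = (1 + beta^2) * P * R / (beta^2 * P + R), with beta = 2 so that recall is weighted more heavily than precision. The function name `f_beta` is illustrative and is not taken from the paper.

```python
def f_beta(precision: float, recall: float, beta: float = 2.0) -> float:
    """Return the F-beta score, a weighted harmonic mean of precision and recall.

    beta > 1 weights recall more heavily; beta = 2 gives the F2 measure.
    """
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        # Both precision and recall are zero, so the score is defined as zero.
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


# Values reported in the abstract: precision 0.21, recall 0.39.
f2 = f_beta(precision=0.21, recall=0.39, beta=2.0)
print(round(f2, 2))  # prints 0.33
```

With these inputs the score is about 0.333, which matches the reported F2 of 0.33 after rounding.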
