Annotation analysis for testing drug safety signals using unstructured clinical notes

Background: Electronic surveillance for adverse drug events is largely based on the analysis of coded data from reporting systems. Yet the vast majority of electronic health data lies embedded within the free text of clinical notes and is not gathered into centralized repositories. With increasing access to large volumes of electronic medical data, in particular clinical notes, it may be possible to computationally encode and actively test drug safety signals.

Results: We describe the application of simple annotation tools to clinical text and the mining of the resulting annotations to compute the risk of myocardial infarction for patients with rheumatoid arthritis who take Vioxx. Our analysis reveals an elevated risk of myocardial infarction in rheumatoid arthritis patients taking Vioxx (odds ratio 2.06) before 2005.

Conclusions: Our results show that it is possible to apply annotation analysis methods to test hypotheses about drug safety using electronic medical records.
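As an illustration of the kind of risk estimate reported above, the following minimal sketch computes an odds ratio and an approximate 95% confidence interval from a 2x2 table of patient counts. It is not the authors' pipeline: the function name `odds_ratio`, the cell layout, and the example counts are hypothetical, and the interval uses the standard log-odds (Woolf) approximation.

```python
import math


def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio with an approximate confidence interval.

    Assumed (hypothetical) cell layout for a 2x2 table:
      a: exposed patients with the outcome
      b: exposed patients without the outcome
      c: unexposed patients with the outcome
      d: unexposed patients without the outcome
    z is the normal quantile (1.96 for a 95% interval).
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    estimate = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se)
    upper = math.exp(math.log(estimate) + z * se)
    return estimate, lower, upper


# Hypothetical counts for illustration only; these are not data from the study.
estimate, lower, upper = odds_ratio(20, 80, 10, 90)
print(f"OR = {estimate:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

In an annotation-based setting, the four counts would come from patients whose notes mention (or do not mention) the drug and the outcome in the appropriate temporal order; how those mentions are extracted and ordered is outside the scope of this sketch.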
