Linking Genomic and Clinical Data for Discovery and Personalized Care

Electronic health records (EHRs) are a powerful tool for improving health care quality while reducing its cost. As longitudinal repositories of patient diagnoses, treatments, and responses to treatment, EHRs are increasingly recognized as an important resource for research as well as for clinical care. When coupled with DNA biobanks, EHRs can also supply the clinical phenotypes needed for genomic studies. This chapter summarizes the role of EHR data in genomic discovery, the methods needed to use it, and the implementation of genomic findings in clinical care. Accurately interpreting and repurposing EHR data for clinical and genomic research presents a number of challenges. Typically, investigators employ multimodal “phenotype algorithms” to identify cases and controls accurately. Such algorithms integrate billing codes, medication records, laboratory and test results, and clinical notes to achieve the necessary recall and precision. Because much of the clinical record consists of unstructured (narrative) documentation, natural language processing is often required. Despite these challenges, researchers have successfully replicated known genetic associations and made new discoveries using EHR data. Early demonstration projects suggest that the EHR, coupled with advanced genome-enabled decision support, may be the ideal repository for operationalizing genomic data for clinical use.
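To make the idea of a multimodal phenotype algorithm concrete, the following is a minimal sketch and not the chapter's own method. It assumes a hypothetical `PatientRecord` structure and illustrative, unvalidated criteria for a diabetes-like phenotype (an ICD-9 code prefix, a short medication list, a laboratory threshold, and note keywords). Simple keyword matching stands in for natural language processing here; a real system would at least need to handle negation and context. The sketch requires evidence from two or more data sources before calling a patient a case, then scores the result against a manually reviewed reference set using precision and recall.

```python
from dataclasses import dataclass, field


@dataclass
class PatientRecord:
    """Hypothetical, simplified view of one patient's EHR data."""
    patient_id: str
    billing_codes: list[str] = field(default_factory=list)
    medications: list[str] = field(default_factory=list)
    labs: dict[str, float] = field(default_factory=dict)
    note_text: str = ""


# Illustrative criteria only; these are assumptions, not a validated algorithm.
CODE_PREFIX = "250"                      # ICD-9 diabetes mellitus code family
PHENOTYPE_MEDS = {"metformin", "glipizide"}
HBA1C_THRESHOLD = 6.5                    # percent
NOTE_KEYWORDS = ("type 2 diabetes", "t2dm")


def evidence_count(rec: PatientRecord) -> int:
    """Count how many independent data sources support the phenotype."""
    has_code = any(c.startswith(CODE_PREFIX) for c in rec.billing_codes)
    has_med = any(m.lower() in PHENOTYPE_MEDS for m in rec.medications)
    has_lab = rec.labs.get("hba1c", 0.0) >= HBA1C_THRESHOLD
    # Crude keyword search standing in for NLP; it ignores negation.
    text = rec.note_text.lower()
    has_note = any(k in text for k in NOTE_KEYWORDS)
    return sum([has_code, has_med, has_lab, has_note])


def is_case(rec: PatientRecord, min_evidence: int = 2) -> bool:
    """Call a case only when several sources agree."""
    return evidence_count(rec) >= min_evidence


def precision_recall(predicted: set[str], truth: set[str]) -> tuple[float, float]:
    """Precision and recall of predicted case IDs against reviewed cases."""
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall
```

Raising `min_evidence` generally trades recall for precision, which is the balance a phenotype algorithm is tuned to strike; a billing code alone, for example, does not qualify a patient as a case under the default setting.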
