PheWAS and Beyond: The Landscape of Associations with Medical Diagnoses and Clinical Measures across 38,662 Individuals from Geisinger.

Most phenome-wide association studies (PheWASs) to date have tested a small to moderate number of SNPs for association with phenotypic data. We performed a large-scale single-cohort PheWAS using electronic health record (EHR)-derived case-control status for 541 diagnoses, defined by International Classification of Diseases, Ninth Revision (ICD-9) codes, together with the median values of 25 clinical laboratory measures. We calculated associations between these diagnoses and traits and ∼630,000 common SNPs (minor allele frequency > 0.01) in 38,662 individuals. In this landscape PheWAS, we explored results within diseases and traits, comparing them to associations previously reported in genome-wide association studies (GWASs) and in previously published PheWASs. We further annotated the associated variants with their functional context, from protein-coding to regulatory regions, providing a deeper interpretation of these associations. The comprehensive nature of this PheWAS allows for novel hypothesis generation, the identification of phenotypes that warrant further study and future phenotypic algorithm development, and the identification of cross-phenotype associations.
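
The minor allele frequency cutoff mentioned above can be illustrated with a minimal sketch. It assumes genotypes are coded as 0, 1, or 2 copies of the alternate allele, with `None` for missing calls. The function names and toy data are hypothetical and for illustration only; this is not the study's actual pipeline.

```python
from typing import Dict, List, Optional


def minor_allele_frequency(genotypes: List[Optional[int]]) -> float:
    """MAF for one biallelic SNP from 0/1/2 alternate-allele counts (None = missing)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0  # no usable calls; treat as monomorphic (an assumption)
    alt_freq = sum(called) / (2 * len(called))  # each individual carries two alleles
    return min(alt_freq, 1.0 - alt_freq)        # the minor allele is the rarer one


def filter_common_snps(
    snps: Dict[str, List[Optional[int]]], threshold: float = 0.01
) -> List[str]:
    """Keep SNP IDs whose MAF is strictly greater than the threshold."""
    return [snp_id for snp_id, g in snps.items()
            if minor_allele_frequency(g) > threshold]


# Hypothetical toy genotypes, not data from the study.
toy = {
    "snp_a": [0, 1, 2, 0, None, 1],  # alt frequency 4/10
    "snp_b": [0] * 199 + [1],        # alt frequency 1/400, below the cutoff
    "snp_c": [2, 2, 2, 1],           # alt frequency 7/8, so MAF is 1/8
}
common = filter_common_snps(toy)
print(common)
```

In practice a filter like this would typically be applied with a dedicated genetics toolkit before association testing. The sketch only makes the definition of the threshold concrete.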
