eMERGE Phenome-Wide Association Study (PheWAS) identifies clinical associations and pleiotropy for stop-gain variants

Background: We explored premature stop-gain variants to test the hypothesis that variants likely to affect protein structure and function will reveal important insights into the phenotypes associated with them. We performed a phenome-wide association study (PheWAS) of the association between a selected list of functional stop-gain genetic variants (variation resulting in truncated proteins or in nonsense-mediated decay) and an extensive group of diagnoses, to identify novel associations and uncover potential pleiotropy.

Results: We selected 25 stop-gain variants: 5 with previously reported phenotypic associations and 20 putative stop-gain variants identified using dbSNP. For the PheWAS, we used data from the electronic MEdical Records and GEnomics (eMERGE) Network across 9 sites, with a total of 41,057 unrelated patients. We divided these samples into two datasets with equal proportions of eMERGE site, sex, race, and genotyping platform. We calculated single-effect associations between the 25 stop-gain variants and ICD-9-defined case-control diagnoses, and also performed stratified analyses for samples of European and African ancestry. Associations were adjusted for sex, site, genotyping platform, and the first three principal components to account for global ancestry. We identified previously known associations, such as variants in LPL associated with hyperglyceridemia, indicating that our approach was robust. We also found three significant associations with p < 0.01 in both datasets. The most significant replicating result was LPL SNP rs328 and ICD-9 code 272.1 “Disorder of Lipoid metabolism” (discovery p = 2.59 × 10⁻⁶, replication p = 2.7 × 10⁻⁴). The other two replicated associations were variant rs1137617 in the KCNH2 gene with ICD-9 code category 244 “Acquired Hypothyroidism” (discovery p = 5.31 × 10⁻³, replication p = 1.15 × 10⁻³) and variant rs12060879 in the DPT gene with ICD-9 code category 996 “Complications peculiar to certain specified procedures” (discovery p = 8.65 × 10⁻³, replication p = 4.16 × 10⁻³).

Conclusion: This PheWAS revealed novel associations of stop-gain variants with interesting phenotypes (ICD-9 codes), along with pleiotropic effects.
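
The two-dataset design described in the Results can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the abstract does not say how the split was implemented, so the field names (`site`, `sex`, `race`, `platform`), the alternating assignment within each stratum, and the function names are all assumptions made for illustration. The replication filter applies the stated criterion (p < 0.01 in both datasets) to the three associations reported in the abstract, reading the two discovery exponents as negative because a p-value cannot exceed 1.

```python
import random
from collections import Counter, defaultdict

# Hypothetical field names for the four balancing variables named in the abstract.
STRATA_KEYS = ("site", "sex", "race", "platform")


def stratum_counts(samples, keys=STRATA_KEYS):
    """Count samples per (site, sex, race, platform) combination."""
    return Counter(tuple(s[k] for k in keys) for s in samples)


def stratified_halves(samples, keys=STRATA_KEYS, seed=0):
    """Split samples into two datasets with near-equal counts in every stratum.

    Assumed procedure: shuffle within each stratum, then alternate assignment.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for sample in samples:
        strata[tuple(sample[k] for k in keys)].append(sample)
    first, second = [], []
    for members in strata.values():
        rng.shuffle(members)
        first.extend(members[0::2])   # even positions -> first dataset
        second.extend(members[1::2])  # odd positions  -> second dataset
    return first, second


def replicated(results, alpha=0.01):
    """Keep associations with p < alpha in both the discovery and replication sets."""
    return [r for r in results
            if r["p_discovery"] < alpha and r["p_replication"] < alpha]


# The three replicated associations reported in the abstract.
# Discovery exponents for KCNH2 and DPT are read as 10^-3 (assumption from context).
REPORTED = [
    {"variant": "rs328", "gene": "LPL", "icd9": "272.1",
     "p_discovery": 2.59e-6, "p_replication": 2.7e-4},
    {"variant": "rs1137617", "gene": "KCNH2", "icd9": "244",
     "p_discovery": 5.31e-3, "p_replication": 1.15e-3},
    {"variant": "rs12060879", "gene": "DPT", "icd9": "996",
     "p_discovery": 8.65e-3, "p_replication": 4.16e-3},
]
```

Calling `stratified_halves(samples)` on a list of per-patient dictionaries returns two lists whose `stratum_counts` differ by at most one in any stratum. Because the extra sample of an odd-sized stratum always goes to the first list, that list can end up slightly larger overall. `replicated(REPORTED)` returns all three rows, since each meets the threshold in both datasets.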
