Exploring the phenotypic consequences of tissue specific gene expression variation inferred from GWAS summary statistics

Scalable, integrative methods to understand mechanisms that link genetic variants with phenotypes are needed. Here we derive a mathematical expression to compute PrediXcan (a gene mapping approach) results using summary data (S-PrediXcan) and show its accuracy and general robustness to misspecified reference sets. We apply this framework to 44 GTEx tissues and 100+ phenotypes from GWAS and meta-analysis studies, creating a growing public catalog of associations that seeks to capture the effects of gene expression variation on human phenotypes. Replication in an independent cohort is shown. Most of the associations are tissue specific, suggesting context specificity of the trait etiology. Colocalized significant associations in unexpected tissues underscore the need for an agnostic scanning of multiple contexts to improve our ability to detect causal regulatory mechanisms. Monogenic disease genes are enriched among significant associations for related traits, suggesting that smaller alterations of these genes may cause a spectrum of milder phenotypes.

Phenotypic variation and diseases are influenced by factors such as genetic variants and gene expression. Here, Barbeira et al. develop S-PrediXcan to compute PrediXcan results using summary data, and investigate the effects of gene expression variation on human phenotypes in 44 GTEx tissues and >100 phenotypes.
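The abstract does not state the summary-data expression itself. As a rough illustration, the approximation usually associated with S-PrediXcan combines per-SNP GWAS z-scores with a gene's prediction weights and with SNP variances and covariances estimated from a reference panel: Z_g ≈ sum over SNPs l of w_lg * (sigma_l / sigma_g) * z_l, where sigma_g^2 = w' * Gamma * w and Gamma is the SNP covariance matrix. The Python sketch below is a minimal version under stated assumptions: the function name `s_predixcan_z` and its argument names are hypothetical (not the authors' software interface), and the inputs are assumed to be already aligned to the same SNPs and effect alleles.

```python
import math


def s_predixcan_z(weights, snp_cov, gwas_z):
    """Approximate a gene-level z-score from GWAS summary statistics.

    weights : prediction weights w_lg for one gene (one per SNP)
    snp_cov : SNP covariance matrix (list of lists) from a reference panel
    gwas_z  : per-SNP GWAS z-scores, i.e. beta_l / se(beta_l)

    All names here are illustrative assumptions, not a published API.
    """
    n = len(weights)
    if len(gwas_z) != n or len(snp_cov) != n or any(len(row) != n for row in snp_cov):
        raise ValueError("weights, gwas_z and snp_cov must have matching sizes")

    # Variance of predicted expression: sigma_g^2 = w' * Gamma * w
    var_g = sum(
        weights[i] * snp_cov[i][j] * weights[j]
        for i in range(n)
        for j in range(n)
    )
    if var_g <= 0:
        raise ValueError("predicted expression variance must be positive")

    # Numerator: sum_l w_l * sigma_l * z_l, with sigma_l = sqrt(Gamma_ll)
    numerator = sum(
        weights[i] * math.sqrt(snp_cov[i][i]) * gwas_z[i] for i in range(n)
    )
    return numerator / math.sqrt(var_g)
```

With a single-SNP model the expression reduces to that SNP's own GWAS z-score (up to the sign of the weight), which is a convenient sanity check. The sketch omits the practical steps a real pipeline needs, such as allele harmonization and handling of SNPs missing from the GWAS.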
