Opportunities and challenges for transcriptome-wide association studies

Transcriptome-wide association studies (TWAS) integrate genome-wide association studies (GWAS) and gene expression datasets to identify gene–trait associations. In this Perspective, we explore properties of TWAS as a potential approach to prioritize causal genes at GWAS loci, by using simulations and case studies of literature-curated candidate causal genes for schizophrenia, low-density-lipoprotein cholesterol and Crohn’s disease. We explore risk loci where TWAS accurately prioritizes the likely causal gene as well as loci where TWAS prioritizes multiple genes, some likely to be non-causal, owing to sharing of expression quantitative trait loci (eQTL). TWAS is especially prone to spurious prioritization with expression data from non-trait-related tissues or cell types, owing to substantial cross-cell-type variation in expression levels and eQTL strengths. Nonetheless, TWAS prioritizes candidate causal genes more accurately than simple baselines. We suggest best practices for causal-gene prioritization with TWAS and discuss future opportunities for improvement. Our results showcase the strengths and limitations of using eQTL datasets to determine causal genes at GWAS loci.

Transcriptome-wide association studies (TWAS) prioritize candidate causal genes at GWAS loci. This Perspective discusses the challenges to TWAS analysis, caveats to interpretation of results and opportunities for improvements to this class of methods.
