PancanQTL: systematic identification of cis-eQTLs and trans-eQTLs in 33 cancer types

Abstract: Expression quantitative trait locus (eQTL) analysis, which links variations in gene expression to genotypes, is essential to understanding gene regulation and to interpreting disease-associated loci. Most eQTLs identified to date come from samples of blood and other normal tissues, and no database comprehensively provides eQTLs from a large number of cancer samples. Using the genotype and expression data of 9196 tumor samples in 33 cancer types from The Cancer Genome Atlas (TCGA), we identified 5 606 570 eQTL-gene pairs in the cis-eQTL analysis and 231 210 eQTL-gene pairs in the trans-eQTL analysis. We further performed survival analysis and identified 22 212 eQTLs associated with patient overall survival. Furthermore, we linked the eQTLs to genome-wide association studies (GWAS) data and identified 337 131 eQTLs that overlap with existing GWAS loci. We developed PancanQTL, a user-friendly database (http://bioinfo.life.hust.edu.cn/PancanQTL/), to store cis-eQTLs, trans-eQTLs, survival-associated eQTLs and GWAS-related eQTLs and to enable searching, browsing and downloading. PancanQTL could help the research community understand the effects of inherited variants in tumorigenesis and development.
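
As a rough illustration of what an eQTL test measures, the sketch below regresses a gene's expression on genotype dosage (0, 1 or 2 copies of the alternate allele) for a single SNP-gene pair and labels the pair as cis or trans by genomic distance. It is a minimal sketch and not the PancanQTL pipeline: the function names `eqtl_slope` and `is_cis`, the toy data, and the 1 Mb window are all assumptions made for illustration, and covariate adjustment and multiple-testing correction are omitted.

```python
import numpy as np


def eqtl_slope(genotype, expression):
    """Return (slope, Pearson r) for expression regressed on genotype dosage.

    Hypothetical helper for illustration only; no covariates are included.
    """
    g = np.asarray(genotype, dtype=float)
    e = np.asarray(expression, dtype=float)
    g_c = g - g.mean()  # centered genotype dosage
    e_c = e - e.mean()  # centered expression
    slope = (g_c @ e_c) / (g_c @ g_c)  # ordinary least-squares slope
    r = (g_c @ e_c) / np.sqrt((g_c @ g_c) * (e_c @ e_c))  # Pearson correlation
    return slope, r


def is_cis(snp_pos, gene_tss, window=1_000_000):
    """Label a same-chromosome SNP-gene pair as cis if within `window` bp.

    The 1 Mb default is an assumed, illustrative threshold.
    """
    return abs(snp_pos - gene_tss) <= window


# Toy data: expression rises by one unit per alternate allele.
genotype = [0, 1, 2, 0, 1, 2]
expression = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
slope, r = eqtl_slope(genotype, expression)
```

In a real analysis this test is repeated for every SNP-gene pair, which is why dedicated, matrix-based tools are used instead of a per-pair loop.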
