CancerSplicingQTL: a database for genome-wide identification of splicing QTLs in human cancer

Abstract

Alternative splicing (AS) is a widespread process that increases structural transcript variation and proteome diversity. Aberrant splicing patterns are frequently observed in cancer initiation, progression, prognosis and therapy. Increasing evidence has demonstrated that AS events can be modulated by genetic variants. Identifying splicing quantitative trait loci (sQTLs), genetic variants that affect AS events, is therefore an important step toward understanding the contribution of genetic variants to disease development. However, no database has yet been developed to systematically analyze sQTLs across multiple cancer types. Using genotype data from The Cancer Genome Atlas and corresponding AS values calculated by TCGASpliceSeq, we developed a computational pipeline to identify sQTLs from 9 026 tumor samples in 33 cancer types. In total, we identified 4 599 598 sQTLs across all cancer types. We further performed survival analyses and identified 17 072 sQTLs associated with patient overall survival times. Furthermore, using genome-wide association study (GWAS) catalog data, we identified 1 180 132 sQTLs overlapping with known GWAS linkage disequilibrium regions. Finally, we constructed a user-friendly database, CancerSplicingQTL (http://www.cancersplicingqtl-hust.com/), in which users can browse, search and download data of interest. This database provides an informative sQTL resource for further characterizing the potential functional roles of SNPs that control transcript isoforms in human cancer.
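
The abstract does not state which statistical model the pipeline uses. As a minimal illustration of what testing a single SNP against a single AS event can look like, the sketch below assumes an additive model: a simple linear regression of the splicing value (for example, a percent-spliced-in value between 0 and 1) on genotype dosage (0, 1 or 2 copies of the alternative allele). The function name `sqtl_association` and the toy data are hypothetical, and covariates that a real analysis would typically include (such as population structure or sex) are omitted.

```python
import math
from statistics import mean


def sqtl_association(genotypes, psi):
    """Regress splicing values on genotype dosage for one SNP/AS-event pair.

    genotypes: per-sample dosage of the alternative allele (0, 1 or 2).
    psi: per-sample splicing values, in the same sample order.
    Returns (slope, r, t_stat), where t_stat has n - 2 degrees of freedom.
    """
    n = len(genotypes)
    if n != len(psi) or n < 3:
        raise ValueError("need at least 3 paired samples")

    g_mean, p_mean = mean(genotypes), mean(psi)
    sxx = sum((g - g_mean) ** 2 for g in genotypes)
    syy = sum((p - p_mean) ** 2 for p in psi)
    sxy = sum((g - g_mean) * (p - p_mean) for g, p in zip(genotypes, psi))
    if sxx == 0 or syy == 0:
        raise ValueError("genotype or splicing values have no variance")

    slope = sxy / sxx                  # change in splicing value per allele copy
    r = sxy / math.sqrt(sxx * syy)     # Pearson correlation
    if abs(r) >= 1.0:
        t_stat = math.copysign(math.inf, r)
    else:
        t_stat = r * math.sqrt((n - 2) / (1.0 - r * r))
    return slope, r, t_stat


# Toy data (made up for illustration): six samples, two per genotype class.
toy_genotypes = [0, 0, 1, 1, 2, 2]
toy_psi = [0.1, 0.2, 0.4, 0.5, 0.8, 0.9]
slope, r, t_stat = sqtl_association(toy_genotypes, toy_psi)
print(f"slope={slope:.3f} r={r:.3f} t={t_stat:.2f}")
```

In a genome-wide setting this test would be repeated for every SNP and AS event pair, followed by a multiple-testing correction; the choice of correction and significance threshold is not given in the abstract.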

[1]  Robert Brown,et al.  TCGASpliceSeq a compendium of alternative mRNA splicing in cancer , 2015, Nucleic Acids Res..

[2]  J. Long,et al.  A Comprehensive cis-eQTL Analysis Revealed Target Genes in Breast Cancer Susceptibility Loci Identified in Genome-wide Association Studies. , 2018, American journal of human genetics.

[3]  Michael D. Wilson,et al.  The Evolutionary Landscape of Alternative Splicing in Vertebrate Species , 2012, Science.

[4]  Yang Yang,et al.  A Rare Variant P507L in TPP1 Interrupts TPP1–TIN2 Interaction, Influences Telomere Length, and Confers Colorectal Cancer Risk in Chinese Population , 2018, Cancer Epidemiology, Biomarkers & Prevention.

[5]  Jun S. Liu,et al.  The Genotype-Tissue Expression (GTEx) pilot analysis: Multitissue gene regulation in humans , 2015, Science.

[6]  Emmanouil T. Dermitzakis,et al.  Putative cis-regulatory drivers in colorectal cancer , 2014, Nature.

[7]  Helen E. Parkinson,et al.  The new NHGRI-EBI Catalog of published genome-wide association studies (GWAS Catalog) , 2016, Nucleic Acids Res..

[8]  Tom R. Gaunt,et al.  Systematic identification of genetic influences on methylation across the human life course , 2016, Genome Biology.

[9]  Jiang Chang,et al.  Integrative expression quantitative trait locus-based analysis of colorectal cancer identified a functional polymorphism regulating SLC22A5 expression. , 2018, European journal of cancer.

[10]  Eric T. Wang,et al.  Alternative Isoform Regulation in Human Tissue Transcriptomes , 2008, Nature.

[11]  Roby Joehanes,et al.  Identification of common genetic variants controlling transcript isoform variation in human whole blood , 2015, Nature Genetics.

[12]  Andrew D. Johnson,et al.  SNAP: a web-based tool for identification and annotation of proxy SNPs using HapMap , 2008, Bioinform..

[13]  W. Symmans,et al.  Alternative splicing disrupts a nuclear localization signal in spleen tyrosine kinase that is required for invasion suppression in breast cancer. , 2003, Cancer research.

[14]  Zhao Zhang,et al.  PancanQTL: systematic identification of cis-eQTLs and trans-eQTLs in 33 cancer types , 2017, Nucleic Acids Res..

[15]  E. Lander,et al.  Lessons from the Cancer Genome , 2013, Cell.

[16]  Patrick F. Sullivan,et al.  High density methylation QTL analysis in human blood via next-generation sequencing of the methylated genomic DNA fraction , 2015, Genome Biology.

[17]  M. Peters,et al.  Systematic identification of trans eQTLs as putative drivers of known disease associations , 2013, Nature Genetics.

[18]  Benno Pütz,et al.  Genome-wide mapping of genetic determinants influencing DNA methylation and gene expression in human hippocampus , 2017, Nature Communications.

[19]  Sooyoung Cho,et al.  ASpedia: a comprehensive encyclopedia of human alternative splicing , 2017, Nucleic Acids Res..

[20]  T. Cooper,et al.  Pre-mRNA splicing and human disease. , 2003, Genes & development.

[21]  X. Miao,et al.  A polymorphic MYC response element in KBTBD11 influences colorectal cancer risk, especially in interaction with an MYC-regulated SNP rs6983267 , 2017, Annals of oncology : official journal of the European Society for Medical Oncology.

[22]  Atsushi Takata,et al.  Genome-wide identification of splicing QTLs in the human brain and their enrichment among schizophrenia-associated loci , 2017, Nature Communications.

[23]  Ellen T. Gelfand,et al.  The Genotype-Tissue Expression (GTEx) project , 2013, Nature Genetics.

[24]  Andrey A. Shabalin,et al.  Matrix eQTL: ultra fast eQTL analysis via large matrix operations , 2011, Bioinform..

[25]  B. Shastry SNPs: impact on gene function and phenotype. , 2009, Methods in molecular biology.

[26]  E. Volpe,et al.  The Splicing Factor PTBP1 Promotes Expression of Oncogenic Splice Variants and Predicts Poor Prognosis in Patients with Non–muscle-Invasive Bladder Cancer , 2018, Clinical Cancer Research.

[27]  D. Reich,et al.  Principal components analysis corrects for stratification in genome-wide association studies , 2006, Nature Genetics.

[28]  R. Lothe,et al.  Aberrant RNA splicing in cancer; expression changes and driver mutations of splicing factor genes , 2016, Oncogene.

[29]  Yang Yang,et al.  A Rare Missense Variant in TCF7L2 Associates with Colorectal Cancer Risk by Interacting with a GWAS-Identified Regulatory Variant in the MYC Enhancer. , 2018, Cancer research.

[30]  Mingyao Li,et al.  Mapping Splicing Quantitative Trait Loci in RNA-Seq , 2014, Cancer informatics.

[31]  R. Durbin,et al.  Using probabilistic estimation of expression residuals (PEER) to obtain increased power and interpretability of gene expression analyses , 2012, Nature Protocols.

[32]  E. Dermitzakis,et al.  Alternative Splicing QTLs in European and African Populations , 2015, American journal of human genetics.

[33]  Emilie Lalonde,et al.  RNA sequencing reveals the role of splicing polymorphisms in regulating human gene expression. , 2011, Genome research.

[34]  A. Chen-Plotkin,et al.  The Post-GWAS Era: From Association to Function. , 2018, American journal of human genetics.

[35]  O. Abdel-Wahab,et al.  Aberrant RNA Splicing in Cancer. , 2019, Annual review of cancer biology.

[36]  J. Fackenthal,et al.  Aberrant RNA splicing and its functional consequences in cancer cells , 2008, Disease Models & Mechanisms.