Recurrent tumor-specific regulation of alternative polyadenylation of cancer-related genes

Background: Alternative polyadenylation (APA) results in messenger RNA molecules with different 3′ untranslated regions (3′ UTRs), affecting the molecules' stability, localization, and translation. APA is pervasive and implicated in cancer. Earlier reports on APA focused on 3′ UTR length modifications and commonly characterized APA events as 3′ UTR shortening or lengthening. However, such characterization oversimplifies the processing of 3′ ends of transcripts and fails to adequately describe the various scenarios we observe.

Results: We built a cloud-based targeted de novo transcript assembly and analysis pipeline that incorporates our previously developed cleavage site prediction tool, KLEAT. We applied this pipeline to elucidate the APA profiles of 114 genes in 9939 tumor and 729 tissue normal samples from The Cancer Genome Atlas (TCGA). The full set of 10,668 RNA-seq samples from 33 cancer types has not been utilized by previous APA studies. By comparing the frequencies of predicted cleavage sites between normal and tumor sample groups, we identified 77 events (i.e., gene-cancer type pairs) of tumor-specific APA regulation in 13 cancer types; for 15 genes, such regulation is recurrent across multiple cancers. Our results also support a previous report showing the 3′ UTR shortening of FGF2 in multiple cancers. However, over half of the events we identified display complex changes to 3′ UTR length that resist simple classification like shortening or lengthening.

Conclusions: Recurrent tumor-specific regulation of APA is widespread in cancer. However, the regulation pattern that we observed in TCGA RNA-seq data cannot be described as straightforward 3′ UTR shortening or lengthening. Continued investigation into this complex, nuanced regulatory landscape will provide further insight into its role in tumor formation and development.
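The comparison of cleavage site frequencies between tumor and normal sample groups can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the input format (one set of predicted cleavage site coordinates per sample), the function names, and the `min_diff` threshold are all hypothetical, and the abstract does not state which statistic or cutoff was actually used.

```python
from collections import Counter


def site_frequencies(samples):
    """Fraction of samples in which each cleavage site is predicted.

    `samples` is a list of sets; each set holds the cleavage site
    coordinates predicted for one sample (hypothetical input format).
    """
    counts = Counter(site for sites in samples for site in sites)
    n = len(samples)
    return {site: c / n for site, c in counts.items()}


def tumor_specific_sites(tumor, normal, min_diff=0.5):
    """Sites whose detection frequency differs between groups.

    Returns {site: tumor_frequency - normal_frequency} for every site
    whose absolute difference is at least `min_diff` (an assumed,
    illustrative threshold, not a value taken from the study).
    """
    f_tumor = site_frequencies(tumor)
    f_normal = site_frequencies(normal)
    result = {}
    for site in sorted(set(f_tumor) | set(f_normal)):
        diff = f_tumor.get(site, 0.0) - f_normal.get(site, 0.0)
        if abs(diff) >= min_diff:
            result[site] = diff
    return result


# Toy data: coordinates 100 and 250 stand in for two cleavage sites of one gene.
tumor = [{100, 250}, {100}, {100, 250}, {100}]
normal = [{250}, {250}, {100, 250}, {250}]
print(tumor_specific_sites(tumor, normal))
```

In this toy case, one site becomes more frequent in tumors while the other becomes less frequent. A change that involves several sites at once, rather than a single switch, is the kind of pattern that a plain "shortening" or "lengthening" label does not capture.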
