Flexible analysis of TSS mapping data and detection of TSS shifts with TSRexploreR

Heterogeneity in transcription initiation has important consequences for transcript stability and translation, and shifts in transcription start site (TSS) usage are prevalent in various disease and developmental contexts. Accordingly, numerous methods for global TSS profiling have been developed, including our recently published Survey of TRanscription Initiation at Promoter Elements with high-throughput sequencing (STRIPE-seq), a method for profiling TSSs genome-wide with minimal cost and time. In parallel with our development of STRIPE-seq, we built TSRexploreR, an R package for end-to-end analysis of TSS mapping data. TSRexploreR provides functions for TSS and transcription start region (TSR) detection, normalization, correlation, visualization, and differential TSS/TSR analysis. TSRexploreR is highly interoperable, accepting the data structures of TSS and TSR sets generated by several existing tools for processing and alignment of TSS mapping data, such as CAGEr for Cap Analysis of Gene Expression (CAGE) data. Lastly, TSRexploreR implements a novel approach for the detection of shifts in TSS distribution.
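
The abstract does not say how shifts in TSS distribution are scored, so the following is only an illustrative sketch and not TSRexploreR's API or necessarily its method. It assumes one generic way to compare two TSS distributions over the same region: the one-dimensional earth mover's distance between normalized per-position counts, with the difference in mean position giving a direction. All function names and the toy counts are hypothetical.

```python
from itertools import accumulate


def normalize(counts):
    """Convert per-position TSS read counts into a probability distribution."""
    total = sum(counts)
    if total == 0:
        raise ValueError("region has no TSS signal")
    return [c / total for c in counts]


def emd_1d(counts_a, counts_b):
    """Earth mover's distance between two count vectors over the same positions.

    For 1-D distributions on unit-spaced positions, the distance equals the
    sum of absolute differences between the two cumulative distributions.
    """
    if len(counts_a) != len(counts_b):
        raise ValueError("both samples must cover the same positions")
    cdf_a = accumulate(normalize(counts_a))
    cdf_b = accumulate(normalize(counts_b))
    return sum(abs(a - b) for a, b in zip(cdf_a, cdf_b))


def mean_position(counts):
    """Signal-weighted mean position (0-based index within the region)."""
    return sum(i * w for i, w in enumerate(normalize(counts)))


# Hypothetical read counts at eight consecutive positions in one region.
control = [0, 10, 30, 10, 0, 0, 0, 0]
treated = [0, 0, 0, 0, 10, 30, 10, 0]

distance = emd_1d(control, treated)  # magnitude of the shift, in positions
direction = mean_position(treated) - mean_position(control)  # > 0: downstream on the + strand
```

In this toy case the treated profile is the control profile moved three positions downstream, so both the distance and the signed direction come out at about three positions. Normalizing before comparison makes the score reflect where initiation occurs rather than how much signal each sample has.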
