In-Depth Transcriptome Analysis Reveals Novel TARs and Prevalent Antisense Transcription in Human Cell Lines

Several recent studies have indicated that transcription is pervasive outside protein-coding genes and that short antisense transcripts can originate from the promoter and terminator regions of genes. Here we investigate transcription of fragments longer than 200 nucleotides, focusing on transcription antisense to known protein-coding genes and on intergenic transcription. We find that roughly 12% and 16% of all reads originating from promoter and terminator regions, respectively, map antisense to the gene in question. Furthermore, we detect a large number of novel transcriptionally active regions (TARs), which are generally expressed at lower levels than protein-coding genes. The correlation between RNA-seq data and microarray data depends on gene length, with longer genes showing better correlation. In summary, we detect high antisense transcriptional activity from promoter, terminator and intron regions of protein-coding genes and identify a vast number of previously unidentified TARs, including putative novel EGFR transcripts. This shows that in-depth analysis of the transcriptome using RNA-seq is a valuable tool for understanding complex transcriptional events. Furthermore, new algorithms for estimating gene expression from RNA-seq data are needed to minimize length bias.
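The antisense proportion reported for promoter and terminator regions can be illustrated with a minimal sketch. It assumes a strand-specific library in which each read overlapping a region has already been assigned a strand (`+` or `-`). The function name `antisense_fraction` and the toy read counts are hypothetical, chosen for illustration only, and are not taken from the study's pipeline or data.

```python
from collections import Counter


def antisense_fraction(read_strands, gene_strand):
    """Return the fraction of stranded reads that map opposite to the gene.

    read_strands: iterable of '+' / '-' labels, one per read in the region.
    gene_strand:  '+' or '-' strand of the annotated gene.
    """
    if gene_strand not in ("+", "-"):
        raise ValueError("gene_strand must be '+' or '-'")
    counts = Counter(read_strands)
    total = counts["+"] + counts["-"]  # reads without a strand label are ignored
    if total == 0:
        return 0.0
    opposite = "-" if gene_strand == "+" else "+"
    return counts[opposite] / total


# Toy example (made-up counts): 25 reads in the promoter region of a
# '+' strand gene, 3 of which map to the '-' strand.
promoter_reads = ["+"] * 22 + ["-"] * 3
print(antisense_fraction(promoter_reads, "+"))
```

In practice the strand labels would come from the alignment records, and the region boundaries (how far upstream or downstream counts as promoter or terminator) are a separate choice that this sketch does not address.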
