Sequencing Degraded RNA Addressed by 3' Tag Counting

RNA sequencing is widely used for gene expression profiling. Before any RNA sequencing experiment, RNA quality must be measured to assess whether the sample is suitable for downstream analysis. The RNA integrity number (RIN) is a scale for RNA quality that runs from 1 (completely degraded) to 10 (intact). Ideally, samples with high RIN (>8) are used in RNA sequencing experiments. RNA, however, is a fragile molecule that is susceptible to degradation, and obtaining high-quality RNA is often hard, or even impossible, when extracting RNA from certain clinical tissues. Working with low-quality RNA is therefore sometimes the researcher's only option. Here we investigate the effects of RIN on RNA sequencing and suggest a computational method for handling data from samples with low-quality RNA, which also enables reanalysis of published datasets. Using RNA from a human cell line, we generated and sequenced samples with varying RINs and illustrate the effect of RIN on the basic procedure of RNA sequencing, covering both quality aspects and differential expression. We show that RIN has systematic effects on gene coverage, false positives in differential expression, and the quantification of duplicate reads. We introduce 3' tag counting (3TC) as a computational approach to reliably estimate differential expression for samples with low RIN. We show that using 3TC in differential expression analysis significantly reduces false positives when comparing samples with different RIN, while retaining reasonable sensitivity.
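The abstract does not spell out how 3TC is computed, so the following is only a minimal sketch of the general idea of counting reads near the 3' end. It assumes that a read is counted for a gene only if it overlaps a fixed-length window at the gene's 3' end. The `Gene` and `Read` types, the function names, the default window of 1000 bases, and the simplification of ignoring introns are all illustrative assumptions, not details taken from the paper.

```python
from typing import Dict, List, NamedTuple, Tuple


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # 0-based, exclusive
    strand: str  # "+" or "-"


class Read(NamedTuple):
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # 0-based, exclusive


def three_prime_window(gene: Gene, window: int) -> Tuple[int, int]:
    """Return the interval covering the last `window` bases at the 3' end.

    On the "+" strand the 3' end is the highest coordinate; on the "-"
    strand it is the lowest. The window is clipped to the gene body.
    """
    if gene.strand == "+":
        return max(gene.start, gene.end - window), gene.end
    return gene.start, min(gene.end, gene.start + window)


def count_3prime_tags(genes: List[Gene], reads: List[Read],
                      window: int = 1000) -> Dict[str, int]:
    """Count, per gene, the reads that overlap its 3' window.

    `window=1000` is a hypothetical default chosen for illustration.
    Reads elsewhere in the gene body are ignored, so loss of 5' coverage
    in degraded samples does not change the count.
    """
    counts = {g.name: 0 for g in genes}
    for g in genes:
        w_start, w_end = three_prime_window(g, window)
        for r in reads:
            # Half-open interval overlap test on the same chromosome.
            if r.chrom == g.chrom and r.start < w_end and r.end > w_start:
                counts[g.name] += 1
    return counts


genes = [
    Gene("A", "chr1", 0, 5000, "+"),
    Gene("B", "chr1", 10000, 12000, "-"),
]
reads = [
    Read("chr1", 100, 150),      # 5' part of A: ignored
    Read("chr1", 4500, 4550),    # inside A's 3' window
    Read("chr1", 4990, 5040),    # overlaps the end of A's 3' window
    Read("chr1", 10100, 10150),  # inside B's 3' window (minus strand)
    Read("chr1", 11500, 11550),  # 5' part of B: ignored
]
counts = count_3prime_tags(genes, reads)
```

The resulting per-gene counts could then be passed to a count-based differential expression tool in place of whole-gene read counts. The nested loop is quadratic and serves only to keep the sketch short; a real pipeline would use indexed alignments.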
