Quantifying circular RNA expression from RNA‐seq data using model‐based framework

Motivation: Circular RNAs (circRNAs) are a class of non‐coding RNAs that are widely expressed in various cell lines and tissues of many organisms. Although the exact function of most circRNAs is unknown, their cell type‐ and tissue‐specific expression implies crucial roles in many biological processes. Hence, quantifying circRNA expression from high‐throughput RNA‐seq data is becoming increasingly important. Although many model‐based methods have been developed to quantify linear RNA expression from RNA‐seq data, these methods are not applicable to circRNA quantification.

Results: Here, we propose a novel strategy that transforms circular transcripts into pseudo‐linear transcripts and estimates the expression values of both circular and linear transcripts using an existing model‐based algorithm, Sailfish. The new strategy accurately estimates the expression of both linear and circular transcripts from RNA‐seq data. Several factors, such as gene length, expression level and the ratio of circular to linear transcripts, affected the quantification performance for circular transcripts. Compared with count‐based tools, the new computational framework performed better in estimating circRNA expression from both simulated and real ribosomal RNA‐depleted (rRNA‐depleted) RNA‐seq datasets. In addition, accounting for circular transcripts when quantifying expression from rRNA‐depleted RNA‐seq data substantially increased the accuracy of linear transcript expression estimates. The proposed strategy is implemented in a program named Sailfish‐cir.

Availability and Implementation: Sailfish‐cir is freely available at https://github.com/zerodel/Sailfish‐cir.

Contact: tongz@medicine.nevada.edu or wanjun.gu@gmail.com

Supplementary information: Supplementary data are available at Bioinformatics online.
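The abstract does not spell out how a circular transcript is turned into a pseudo‐linear one. The sketch below shows one plausible way to picture the idea, and is an assumption rather than the Sailfish‐cir implementation: because Sailfish works on k‐mers, copying the first k − 1 bases of the circle onto its end yields a linear sequence whose k‐mers include those spanning the back‐splice junction. The function names (`pseudo_linearize`, `kmers`, `circular_kmers`) and the choice of a k − 1 overhang are hypothetical and used only for illustration.

```python
def pseudo_linearize(circ_seq: str, k: int) -> str:
    """Append the first k-1 bases of a circular sequence to its end.

    Hypothetical illustration: the resulting linear string contains every
    k-mer of the circle, including those that cross the back-splice junction.
    """
    if not circ_seq:
        raise ValueError("circ_seq must be non-empty")
    if k < 2:
        raise ValueError("k must be at least 2")
    overhang = k - 1
    # Repeat the circle enough times to cover circles shorter than k-1.
    reps = -(-overhang // len(circ_seq))  # ceiling division
    return circ_seq + (circ_seq * reps)[:overhang]


def kmers(seq: str, k: int) -> set:
    """All k-mers of a linear sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def circular_kmers(circ_seq: str, k: int) -> set:
    """All k-mers read around a circle, one starting at each position."""
    n = len(circ_seq)
    unrolled = circ_seq * (k // n + 2)  # long enough for every start position
    return {unrolled[i:i + k] for i in range(n)}


if __name__ == "__main__":
    circ = "ACGTTGCA"  # toy sequence, not real data
    k = 4
    linear = pseudo_linearize(circ, k)
    print(linear)
    # k-mers that exist only because of the back-splice junction
    print(sorted(kmers(linear, k) - kmers(circ, k)))
```

In this toy case the junction‐spanning k‐mers are exactly the ones missing from the plain linear sequence, which is why a quantifier built for linear transcripts would otherwise have no way to assign junction reads to the circular isoform.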
