Differential and coherent processing patterns from small RNAs

Post-transcriptional processing of short RNAs is often reflected in the read profile patterns that emerge from high-throughput sequencing data. MicroRNA arm switching across different tissues is a well-known example of what we define as differential processing. Here, short RNAs from the nine cell lines of the ENCODE project were analyzed, irrespective of their annotation status, for genomic loci showing differential or coherent processing. We observed differential processing predominantly in RNAs annotated as miRNA, snoRNA or tRNA. Four of the five known differentially processed miRNAs present in the input dataset were recovered, and several novel cases were discovered. In contrast to differential processing, coherent processing is widespread in both annotated and unannotated regions. While the annotated loci consist predominantly of ~24 nt short RNAs, the unannotated loci, by comparison, consist of ~17 nt short RNAs. Furthermore, these ~17 nt short RNAs are significantly enriched for overlap with transcription start sites and DNase I hypersensitive sites (p-value < 0.01), which are characteristic features of transcription initiation RNAs. We discuss how the computational pipeline developed in this study could be applied to other forms of RNA-seq data for further transcriptome-wide studies of differential and coherent processing.
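
To make the notion of differential processing concrete, the following minimal Python sketch flags a miRNA locus as a candidate for arm switching when the dominant arm (5p or 3p) differs between cell lines. It is an illustration only and not the pipeline developed in this study: the function names, the per-arm read counts, and the `min_reads` and `min_fraction` thresholds are all hypothetical assumptions chosen for the example.

```python
from typing import Dict, Optional, Tuple


def dominant_arm(five_p: int, three_p: int,
                 min_reads: int = 10,
                 min_fraction: float = 0.6) -> Optional[str]:
    """Return "5p" or "3p" if one arm clearly dominates, else None.

    min_reads and min_fraction are illustrative thresholds (assumptions),
    not values taken from the study.
    """
    total = five_p + three_p
    if total < min_reads:
        return None  # too few reads to call a dominant arm
    fraction_5p = five_p / total
    if fraction_5p >= min_fraction:
        return "5p"
    if 1.0 - fraction_5p >= min_fraction:
        return "3p"
    return None  # neither arm dominates clearly


def shows_arm_switching(arm_counts: Dict[str, Tuple[int, int]]) -> bool:
    """True if at least two cell lines have different dominant arms.

    arm_counts maps a cell line name to (5p reads, 3p reads).
    """
    arms = {dominant_arm(five_p, three_p)
            for five_p, three_p in arm_counts.values()}
    arms.discard(None)  # ignore cell lines without a clear call
    return len(arms) > 1


# Hypothetical read counts for one miRNA locus in three cell lines.
switching_locus = {"cellA": (900, 100), "cellB": (50, 450), "cellC": (3, 2)}
coherent_locus = {"cellA": (900, 100), "cellB": (800, 150)}

print(shows_arm_switching(switching_locus))
print(shows_arm_switching(coherent_locus))
```

In this toy setting, a locus whose dominant arm is the same in every cell line corresponds to coherent processing, whereas a change of dominant arm corresponds to differential processing. A real analysis would compare whole read profiles rather than two counts per locus.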
