The discovery potential of RNA processing profiles

Small non-coding RNAs (sncRNAs) are highly abundant molecules that regulate essential cellular processes and are classified according to sequence and structure. Here we argue that read profiles from size-selected RNA sequencing capture the post-transcriptional processing specific to each RNA family, thereby providing functional information independently of sequence and structure. We developed SeRPeNT, the first unsupervised computational method that exploits reproducibility across replicates and uses dynamic time-warping and density-based clustering algorithms to identify, characterize and compare sncRNAs from their read profiles. We applied SeRPeNT to (a) generate an extended human annotation with 671 new sncRNAs from known classes and 131 from potential new classes, (b) show pervasive differential processing between cell compartments and (c) predict new molecules with miRNA-like behaviour from snoRNA, tRNA and long non-coding RNA precursors, potentially dependent on the miRNA biogenesis pathway. Furthermore, we experimentally validated four predicted novel non-coding RNAs: a miRNA, a snoRNA-derived miRNA, a processed tRNA and a new uncharacterized sncRNA. SeRPeNT facilitates fast and accurate discovery and characterization of small non-coding RNAs at unprecedented scale. SeRPeNT code is available under the MIT license at https://github.com/comprna/SeRPeNT.
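
To illustrate how dynamic time-warping can compare two read profiles, the following is a minimal sketch of the classic dynamic-programming formulation. It is not taken from the SeRPeNT code base: the function names `normalize` and `dtw_distance`, the absolute-difference local cost, and the example profiles are all assumptions made for illustration.

```python
from math import inf


def normalize(profile):
    """Scale per-nucleotide read counts so they sum to 1 (hypothetical helper)."""
    total = sum(profile)
    if total == 0:
        return [0.0] * len(profile)
    return [count / total for count in profile]


def dtw_distance(a, b):
    """Classic dynamic time-warping distance between two numeric sequences.

    Local cost is the absolute difference (an assumption for this sketch);
    each step may advance along a, along b, or along both.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


if __name__ == "__main__":
    # Made-up read-count profiles, for illustration only.
    profile_x = normalize([0, 5, 20, 20, 5, 0])
    profile_y = normalize([0, 0, 5, 20, 20, 5, 0])
    print(dtw_distance(profile_x, profile_y))
```

Because warping lets one position align with several positions in the other sequence, two profiles with the same shape but different lengths or a small offset receive a low distance. A matrix of such pairwise distances is the kind of input a density-based clustering step could then group into families.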
