Tailor: a computational framework for detecting non-templated tailing of small silencing RNAs

Small silencing RNAs, including microRNAs, endogenous small interfering RNAs (endo-siRNAs) and Piwi-interacting RNAs (piRNAs), play important roles in fine-tuning gene expression, defending against viruses and controlling transposons. Loss of small silencing RNAs or of components in their pathways often leads to severe developmental defects, including lethality and sterility. Recently, non-templated addition of nucleotides to the 3′ end, termed tailing, was found to be associated with the processing and stability of small silencing RNAs. Next-generation sequencing has made it possible to detect such modifications at nucleotide resolution and at unprecedented throughput. However, detecting tailing events among millions of short reads remains difficult, because sequencing errors and RNA editing also produce mismatches against the reference. Here, we present Tailor, a computational framework driven by an efficient and accurate aligner that is specifically designed to capture tailing events directly from the alignments, without extensive post-processing. Tailor was tested on both simulated and real datasets for tailing analysis and compared favorably with general-purpose aligners. Moreover, to show its broad utility, we used Tailor to reanalyze published datasets and revealed novel findings that merit further experimental validation. The source code and the executable binaries are freely available at https://github.com/jhhung/Tailor.
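
To make the notion of a tailing event concrete, the following is a minimal sketch of how a read can be split into a templated prefix and a non-templated 3′ tail. It is a naive illustration only and is not Tailor's algorithm: the function name `split_tail`, the `min_match` parameter, and the toy sequences are all hypothetical, and the sketch uses a plain substring search with no handling of sequencing errors, RNA editing, or the reverse strand.

```python
from typing import Optional, Tuple


def split_tail(read: str, reference: str,
               min_match: int = 18) -> Optional[Tuple[int, str, str]]:
    """Naive illustration (not Tailor's method).

    Find the longest prefix of `read` that occurs exactly in `reference`
    and treat whatever follows it as a candidate non-templated 3' tail.
    Returns (position, matched_prefix, tail), or None when no prefix of
    at least `min_match` nucleotides is found.
    """
    # Try the full read first, then progressively shorter prefixes.
    for end in range(len(read), min_match - 1, -1):
        prefix = read[:end]
        pos = reference.find(prefix)  # exact match only, forward strand only
        if pos != -1:
            return pos, prefix, read[end:]
    return None


# Toy data (hypothetical): a 22-nt templated sequence embedded in a short
# reference, and a read carrying two extra adenosines at its 3' end.
reference = "GGCAT" + "TGAGGTAGTAGGTTGTATAGTT" + "CCGA"
read = "TGAGGTAGTAGGTTGTATAGTT" + "AA"

hit = split_tail(read, reference)
```

With these toy inputs, `hit` holds the match position, the 22-nt templated prefix, and the tail `AA`. Note that a tail nucleotide that happens to match the next reference base is absorbed into the prefix, so a sketch like this can only report the part of a tail that disagrees with the reference.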
