The UEA Small RNA Workbench: A Suite of Computational Tools for Small RNA Analysis.

RNA silencing (RNA interference, RNAi) is a complex, highly conserved mechanism mediated by short noncoding RNAs, typically 20-24 nt in length, known as small RNAs (sRNAs). They act as guides for the sequence-specific transcriptional and posttranscriptional regulation of target mRNAs and play a key role in the fine-tuning of biological processes such as growth, response to stresses, or defense mechanisms.

High-throughput sequencing (HTS) technologies are employed to capture the expression levels of sRNA populations. Processing the resulting large data sets facilitates the computational analysis of sRNA patterns of variation within biological samples such as time point experiments, tissue series, or various treatments. Rapid technological advances enable larger experiments, often with biological replicates, leading to a vast amount of raw data. As a result, in this fast-evolving field, the existing methods for sequence characterization and prediction of interaction (regulatory) networks periodically require adapting or, in extreme cases, a complete redesign to cope with the data deluge. In addition, the presence of numerous tools focused only on particular steps of HTS analysis hinders the systematic parsing of the results and their interpretation.

The UEA small RNA Workbench (v1-4), described in this chapter, provides a user-friendly, modular, interactive analysis in the form of a suite of computational tools designed to process and mine sRNA datasets for interesting characteristics that can be linked back to the observed phenotypes. First, we show how to preprocess the raw sequencing output and prepare it for downstream analysis. Then we review some quality checks that can be used as a first indication of sources of variability between samples. Next, we show how the Workbench can provide a comparison of the effects of different normalization approaches on the distributions of expression, enhanced methods for the identification of differentially expressed transcripts, and a summary of their corresponding patterns. Finally, we describe individual analysis tools such as PAREsnip, for the analysis of PARE (degradome) data, or CoLIde, for the identification of sRNA loci based on their expression patterns, and the visualization of the results using the software. We illustrate the features of the UEA sRNA Workbench on Arabidopsis thaliana and Homo sapiens datasets.
