Identifying small interfering RNA loci from high-throughput sequencing data

MOTIVATION: Small interfering RNAs (siRNAs) are produced from much longer double-stranded RNA precursors through cleavage by Dicer or a Dicer-like protein. These small RNAs play a key role in genetic and epigenetic regulation; however, a full understanding of the mechanisms by which they operate depends on characterizing the precursors from which they are derived.

RESULTS: High-throughput sequencing of small RNA populations allows the locations of the double-stranded RNA precursors to be inferred. We have developed methods to analyse small RNA sequencing data from multiple biological sources, taking into account replicate information, to identify robust sets of siRNA precursors. Our methods perform well both on a small RNA sequencing dataset from Arabidopsis thaliana and on simulated datasets.

AVAILABILITY: Our methods are available as the Bioconductor (www.bioconductor.org) package segmentSeq (version 1.5.6 and above).
