Meta-analysis of small RNA-sequencing errors reveals ubiquitous post-transcriptional RNA modifications

Recent advances in DNA-sequencing technology have made it possible to obtain large datasets of small RNA sequences. Here we demonstrate that not all imperfectly matched small RNA sequences are simple technical sequencing errors; many hold valuable biological information. Analysis of three small RNA datasets originating from Oryza sativa and Arabidopsis thaliana small RNA-sequencing projects demonstrates that many single-nucleotide substitution errors overlap when homologous, non-identical small RNA sequences are aligned. Investigating the sites and identities of these substitution errors reveals that many potentially originate from post-transcriptional modifications or RNA editing. Modifications include N1-methyl-modified purine nucleotides in tRNA, potential deamination or base substitutions in microRNAs, 3′ microRNA uridine extensions and 5′ microRNA deletions. Further analysis of large sequencing datasets reveals that the combined effects of 5′ deletions and 3′ uridine extensions can alter the specificity with which microRNAs associate with different Argonaute proteins. Hence, we demonstrate that not all sequencing errors in small RNA datasets are technical artifacts; they often provide valuable biological insight into the sites of post-transcriptional RNA modifications.
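
The idea that substitution errors "overlap" at particular sites can be illustrated with a minimal sketch. This is not the pipeline used in the study: the function name `substitution_sites`, the toy reference, and the toy reads are all hypothetical, and the sketch assumes reads of the same length as the reference that are already aligned without gaps. It tallies, per position, the substituted base seen in reads that differ from the reference by exactly one nucleotide.

```python
from collections import Counter


def substitution_sites(reference, reads):
    """Tally single-nucleotide substitutions per reference position.

    Assumption: each read is pre-aligned, gap-free, and the same length
    as the reference; anything else is skipped in this sketch.
    """
    sites = {}
    for read in reads:
        if len(read) != len(reference):
            continue  # length variants (extensions, deletions) are out of scope here
        mismatches = [
            (pos, base)
            for pos, (ref_base, base) in enumerate(zip(reference, read))
            if ref_base != base
        ]
        if len(mismatches) != 1:
            continue  # keep only reads with exactly one substitution
        pos, base = mismatches[0]
        sites.setdefault(pos, Counter())[base] += 1
    return sites


# Hypothetical toy data, not taken from any of the datasets in the study.
reference = "ACGUACGUACGUACGUACGU"
reads = [
    "ACGUACGUACGUACGUACGU",  # perfect match, ignored
    "ACGUGCGUACGUACGUACGU",  # A->G at position 4 (0-based)
    "ACGUGCGUACGUACGUACGU",  # same substitution again
    "ACGUGCGUACGUACGUACGU",  # and again
    "ACGUACGUACGUACCUACGU",  # isolated G->C at position 14
    "ACGUACGUACGUACGUACG",   # shorter read, ignored
]

for pos, counts in sorted(substitution_sites(reference, reads).items()):
    print(pos, dict(counts))
```

In this toy input, the same substitution recurring at one position is the kind of pattern that would prompt a closer look for a modification or editing event, whereas an isolated mismatch is more consistent with a random sequencing error. A real analysis would also need to account for read abundance, base-call quality, and homologous loci.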
